This means that we won't complain to peers which gossip about our
channels, but it does mean that our channel graph (like that of other
nodes on the network) will show two channels, not one, for the duration.
For this reason, we need askrene to omit local dying channels.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We restart the node: if the coin_movements.py plugin hasn't processed the
notification yet, the balance will be incorrect:
```
> assert account_balance(l2, chanid_1) == 100001001
E AssertionError: assert 150_001_001msat == 100_001_001
E + where 150001001msat = account_balance(<fixtures.LightningNode object at 0x7f0634e1eb00>, '39ac52c818c5304cf0664940ff236c4e3f8f4ceb8993cb1491347142d61b62bc')
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. It was flaky, probably because it didn't wait for the remote update_channel.
2. Rusty applied a fix in 5f664dac77, not clear if it worked.
3. Christian disabled it altogether in 23ce9a947d.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since this was written, we have started checking whether the remote side would get into
this situation and stopping it from happening, so the test doesn't work any more.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We can still get a warning:
lightningd-1 2025-12-10T01:11:07.232Z DEBUG 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: Received WIRE_WARNING: WARNING: channel_announcement: no unspent txout 109x1x1
This has nothing to do with l1 talking about the original channel
(which would be 103x1x): it's because l2's gossipd (being the node
which does the splice) immediately forgets the pre-splice id. If l1
sends some gossip, it will get a warning message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to just run these without valgrind, but we already run them in
CI (which sets SLOW_MACHINE) without valgrind, so this just doubles
up.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When paying through a direct channel, direct_pay_override() creates a
route bypassing the normal routing path, which skips the CLTV budget
check in payment_getroute(). This allows payments to succeed even when
maxdelay is set below the required min_final_cltv_expiry.
Add a check in direct_pay_override() to verify the required CLTV
doesn't exceed cltv_budget before using the direct channel shortcut.
If it exceeds, skip the direct channel and let normal routing handle
the failure with a proper error message.
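
A minimal sketch of the added guard, in Python rather than the C of the
real code. The function name `pick_direct_channel` and its parameters are
hypothetical, chosen only to illustrate the comparison described above:

```python
from typing import Optional


def pick_direct_channel(required_cltv: int,
                        cltv_budget: int,
                        direct_scid: Optional[str]) -> Optional[str]:
    """Return the direct channel to use, or None to fall back to routing.

    Hypothetical helper: all names here are illustrative only.
    """
    if direct_scid is None:
        # No direct channel to the destination at all.
        return None
    if required_cltv > cltv_budget:
        # Shortcut would violate maxdelay: let normal routing report it.
        return None
    return direct_scid
```

Returning None rather than failing immediately keeps the error reporting
in one place, which is the behaviour the paragraph above asks for.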
Fixes: #8609
Changelog-Fixed: pay: `maxdelay` parameter now enforced for direct channel payments
We had the test backwards, so we moved it *all the time* except in the one case where
we needed to. This bloats our gossip store.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: gossipd: we would occasionally not show a node announcement in listnodes().
At startup, we load the outpoints to watch, *then* roll back 15
blocks. If there were things in those blocks we wanted to watch, we
no longer do!
1. We load the utxoset into memory: everything in the utxoset table
which has spendheight null.
2. We roll back 15 blocks to re-read. Deleting a block from the
database causes the utxo spendheights referring to it to be set
to null.
3. We roll forward, but we didn't update the in-memory utxoset,
so we're not watching those utxos which are spent.
The main symptom of this is that we spam peers with obsolete gossip
(if we get sent a channel announcement for a closed channel, we can
think it isn't spent yet). But it could *also* mean we don't notice
onchain txs, if we restart at the wrong time!
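
The sequence above can be modelled with a toy simulation. This is only a
sketch under assumptions: the dict stands in for the utxoset table, the
outpoint names and heights are made up, and updating the in-memory set
during rollback is one possible remedy (the real fix may differ):

```python
def load_watched(db):
    """Step 1: watch every outpoint whose spendheight is null (None)."""
    return {op for op, spendheight in db.items() if spendheight is None}


def rollback(db, watched, to_height, update_memory):
    """Step 2: deleting blocks above to_height nulls their spendheights."""
    for op, spendheight in db.items():
        if spendheight is not None and spendheight > to_height:
            db[op] = None
            if update_memory:
                # Assumed remedy: keep the in-memory set in sync.
                watched.add(op)


def restart(update_memory):
    # Made-up data: "bbbb:1" was spent at height 100, the current tip.
    db = {"aaaa:0": None, "bbbb:1": 100}
    watched = load_watched(db)
    rollback(db, watched, to_height=85, update_memory=update_memory)
    return watched
```

With `update_memory=False` the outpoint spent in the rolled-back range is
never watched again, which is the bug described in step 3.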
Changelog-Fixed: lightningd: we could miss tx spends which happened in the past blocks when we restarted.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If they had a channel bias, and ran xpay, it will update the bias
to a v2 bias (with a timestamp). We must downgrade that, or the
older version won't load!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: tools: `lightningd-downgrade` can downgrade your database from v25.12 to v25.09 if something goes wrong.
When installed, the name is `lightning-hsmtool`. We actually copy
`tools/hsmtool` to `tools/lightning-hsmtool` but that's a silly step
which we should get rid of.
So:
1. Make sure our documentation always refers to it as lightning-hsmtool.
2. Make sure our tests invoke it as `lightning-hsmtool`.
3. Rename the C file.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We don't expect an internal command to take 5 seconds to service
without explicitly pausing: if it does, log at a higher level.
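
As a rough illustration of the rule, assuming a 5-second threshold and
using Python's logging levels as stand-ins (the function name, the
`paused` flag and the choice of WARNING are all hypothetical):

```python
import logging

SLOW_COMMAND_SECONDS = 5.0  # threshold taken from the text above


def level_for_duration(elapsed_seconds, paused=False):
    """Pick a log level for a finished internal command.

    Hypothetical helper: commands which explicitly paused are exempt.
    """
    if elapsed_seconds >= SLOW_COMMAND_SECONDS and not paused:
        return logging.WARNING
    return logging.DEBUG
```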
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
When we enter the wrong passphrase hsmd crashes like this with an unknown message type:
lightning_hsmd: Failed to load hsm_secret: Wrong passphrase (version v25.12rc1-7-g7713a42-modded)
0x102ba44bf ???
send_backtrace+0x4f:0
0x102b0900f status_failed
common/status.c:207
0x102af1a37 hsmd_send_init_reply_failure
hsmd/hsmd.c:301
0x102af1497 load_hsm
hsmd/hsmd.c:446
0x102af1497 init_hsm
hsmd/hsmd.c:548
0x102b29e63 next_plan
ccan/ccan/io/io.c:60
0x102b29e63 do_plan
ccan/ccan/io/io.c:422
0x102b29d8b io_ready
ccan/ccan/io/io.c:439
0x102b2b4bf io_loop
ccan/ccan/io/poll.c:470
0x102af0a83 main
hsmd/hsmd.c:886
lightningd: HSM sent unknown message type
This change replaces write_all() with wire_sync_write(), because write_all() omits the wire protocol length prefix. We also no longer send a stack trace if the user has entered the wrong passphrase: we exit cleanly instead.
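
To show why the missing prefix matters, here is a small sketch of
length-prefixed framing. It assumes a 4-byte big-endian length header,
which is an assumption about the daemon wire format rather than something
stated above, and `frame` / `read_frames` are illustrative names:

```python
import struct


def frame(msg: bytes) -> bytes:
    """Prefix msg with its length (assumed: 4 bytes, big-endian)."""
    return struct.pack(">I", len(msg)) + msg


def read_frames(stream: bytes):
    """Split a byte stream back into the messages it contains."""
    messages = []
    offset = 0
    while offset < len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        messages.append(stream[offset:offset + length])
        offset += length
    return messages
```

A reader that expects the prefix will interpret the first bytes of an
unframed message as a length, so it parses garbage as the message type.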
The original method name was lsps-lsps2-invoice, but I somehow messed it
up and renamed it during a rebase.
Changelog-Changed: lsps-jitchannel is now lsps-lsps2-invoice
Signed-off-by: Peter Neuroth <pet.v.ne@gmail.com>
This is slow, but will make sure we find out if we add latency spikes in future.
tests/test_coinmoves.py::test_generate_coinmoves (5,000,000, sqlite3):
Time (from start to end of l2 node): 223 seconds
Latency min/median/max: 0.0023 / 0.0033 / 0.113 seconds
tests/test_coinmoves.py::test_generate_coinmoves (5,000,000, Postgres):
Time (from start to end of l2 node): 470 seconds
Latency min/median/max: 0.0024 / 0.0098 / 0.124 seconds
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: lightningd: multiple significant speedups for large nodes, especially preventing "freezes" under exceptionally high load.
Changelog-Added: Plugins: "filters" can be specified on the `custommsg` hook to limit what message types the hook will be called for.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
tests/test_coinmoves.py::test_generate_coinmoves (2,000,000, sqlite3):
Time (from start to end of l2 node): 135 seconds **WAS 227**
Worst latency: 12.1 seconds **WAS 62.4**
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now we've rid ourselves of the worst offenders, we can make this a real
stress test. We remove plugin io saving and low-level logging, to avoid
benchmarking testing artifacts.
Here are the results:
tests/test_coinmoves.py::test_generate_coinmoves (2,000,000, sqlite3):
Time (from start to end of l2 node): 518 seconds
Worst latency: 353 seconds
tests/test_coinmoves.py::test_generate_coinmoves (2,000,000, Postgres):
Time (from start to end of l2 node): 417 seconds
Worst latency: 96.6 seconds
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we add a new hook, not at the end, while hooks are getting called,
then iteration could be messed up (e.g. calling a plugin twice, or
skipping one).
The simplest thing is to defer updates until nobody is calling the
hook. In theory this could livelock, in practice it won't.
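
A minimal sketch of the deferral idea, in Python for brevity. The `Hook`
class and its fields are hypothetical and only model the approach of
queueing registrations while a call is in flight:

```python
class Hook:
    """Toy hook: registrations are deferred while it is being called."""

    def __init__(self):
        self.plugins = []     # ordered list of registered plugin names
        self.pending = []     # (name, index) registrations to apply later
        self.in_progress = 0  # number of calls currently iterating

    def register(self, name, index=None):
        if self.in_progress:
            self.pending.append((name, index))
        else:
            self._insert(name, index)

    def _insert(self, name, index):
        if index is None:
            self.plugins.append(name)
        else:
            self.plugins.insert(index, name)

    def call(self, fn):
        self.in_progress += 1
        try:
            for plugin in self.plugins:
                fn(plugin)
        finally:
            self.in_progress -= 1
            if self.in_progress == 0:
                # Nobody is iterating any more: apply queued updates.
                pending, self.pending = self.pending, []
                for name, index in pending:
                    self._insert(name, index)
```

If calls overlapped continuously, `in_progress` would never reach zero
and the queue would never drain, which is the theoretical livelock.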
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We start with 100,000 entries. We will scale this to 2M as we fix the
O(N^2) bottlenecks.
I measure the node time after we modify the db, like so:
while guilt push && rm -rf /tmp/ltests* && uv run make -s RUST=0; do RUST=0 VALGRIND=0 TIMEOUT=100 TEST_DEBUG=1 eatmydata uv run pytest -vvv -p no:logging tests/test_coinmoves.py::test_generate_coinmoves > /tmp/`guilt top`-sql 2>&1; done
Then analyzed the results with:
FILE=/tmp/synthetic-data.patch-sql; START=$(grep 'lightningd-2 .* Server started with public key' $FILE | tail -n1 | cut -d\ -f2 | cut -d. -f1); END=$(grep 'lightningd-2 .* JSON-RPC shutdown' $FILE | tail -n1 | cut -d\ -f2 | cut -d. -f1); echo $(( $(date +%s -d $END) - $(date +%s -d $START) )); grep 'E assert' $FILE;
tests/test_coinmoves.py::test_generate_coinmoves (100,000, sqlite3):
Time (from start to end of l2 node): 85 seconds
Worst latency: 75 seconds
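
For readers who prefer not to decode the shell one-liner, the timing part
of the analysis could be sketched in Python as below. It assumes each log
line starts with the node name followed by an ISO timestamp; the function
name is hypothetical:

```python
import re
from datetime import datetime


def node_runtime_seconds(lines, node="lightningd-2"):
    """Seconds between the last 'Server started' and last 'JSON-RPC shutdown'."""

    def last_timestamp(marker):
        pattern = re.compile(re.escape(node) + r" (\S+) .*" + re.escape(marker))
        found = None
        for line in lines:
            match = pattern.search(line)
            if match:
                # Drop fractional seconds, as `cut -d. -f1` does.
                found = datetime.fromisoformat(match.group(1).split(".")[0])
        return found

    start = last_timestamp("Server started with public key")
    end = last_timestamp("JSON-RPC shutdown")
    return int((end - start).total_seconds())
```

Like the shell version, this takes the last matching line of each kind,
so earlier restarts of the node in the same log are ignored.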
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
bc4bb2b0ef "libplugin: use jsonrpc_io logic for sync requests too."
changed this message, and the test was not updated.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Delay can cause bogus complaints:
```
2025-11-13T23:50:03.6643632Z lightningd-3 2025-11-13T23:37:29.947Z **BROKEN** 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: wake delay for WIRE_CHANNEL_REESTABLISH: 5708msec
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>