I noticed this in the logs:
```
lightningd-1 2026-01-28T00:27:37.504Z DEBUG gossipd: gossip_store: Read 59428/118856/0/0 cannounce/cupdate/nannounce/delete from store in 45521871 bytes, now 45521849 bytes (populated=true)
lightningd-1 2026-01-28T00:27:37.504Z DEBUG gossipd: Got 118856 bad cupdates, ignoring them (expected on mainnet)
```
That's weird, and it turns out it was counting good updates, not bad ones!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We reimplemented this redundantly: hash_scid was called
short_channel_id_hash, so I obviously missed it.
Rename, and implement hash_scidd helper too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
gossip_store.c uses this to avoid two reads, and we want to use it
elsewhere too.
Also fix old comment on gossip_store_readhdr().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
gossmap doesn't care, so gossipd currently has to iterate through the
store to find them at startup. Create a callback for gossipd to use
instead.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's used by common/gossip_store.c, which is used by many things other than
gossipd. This file belongs in common.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We also put this in the store_ended message, too: so you can
tell if the equivalent_offset there really refers to this new
entry (or if two or more rewrites have happened).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's actually quite quick to load a cache-hot 308,874,377 byte
gossip_store (normal -Og build), but perf does show time spent
in siphash(), which is a bit overkill here, so drop that:
Before:
Time to load: 66718983-78037766(7.00553e+07+/-2.8e+06)nsec
After:
Time to load: 54510433-57991725(5.61457e+07+/-1e+06)nsec
We could save maybe 10% more by disabling checksums, but having
that assurance is nice.
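
For illustration only, here is a minimal sketch of the kind of cheap, unkeyed hash that is enough for a 64-bit short channel id. The name `cheap_scid_hash` and the multiplicative constant are assumptions made for this example; the hash actually used by the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reduce a 64-bit short_channel_id to a `bits`-wide
 * bucket index.  Multiplying by a large odd constant (Fibonacci hashing)
 * mixes the low fields into the high bits, and we keep the top `bits`. */
static uint64_t cheap_scid_hash(uint64_t scid, unsigned int bits)
{
	/* Shifting by 64 would be undefined, so bits must be 1..63. */
	assert(bits >= 1 && bits <= 63);
	return (scid * UINT64_C(0x9E3779B97F4A7C15)) >> (64 - bits);
}
```

Unlike siphash this offers no protection against deliberately chosen keys; it only spreads keys across buckets cheaply.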
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This was added in 24.05, but LND since 0.18.3 no longer ever creates
such onions, and even that version (September 2024) is now a long way
behind.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: Protocol: we no longer support legacy onions (never sent by LND >= 0.18.3, which was the last)
If we can't decode something, and it decodes as a rune (and all bech32
strings do!), then we would usually just complain it was a malformed
rune. Be a bit more useful when the parameter looks like something else.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON-RPC: `decode` is now more informative with malformed strings (won't claim everything is a malformed rune!).
This happens for about 300 channels, from every process that loads the gossmap.
It's not very useful to flood the logs, so just log a summary.
To be fair, on my node, this is only the 11th most common message, so we will
revisit the others too:
```
1589311 DEBUG 02e01367e1d7818a7e9a0e8a52badd5c32615e07568dbe0497b6a47f9bef89d6af-connectd: peer_out WIRE_WARNING
139993 DEBUG lightningd: fixup_scan: block 786151 with 1203 txs
55388 DEBUG plugin-bcli: Log pruned 1001 entries (mem 10508118 -> 10298662)
33000 DEBUG gossipd: Unreasonable timestamp in 0102000a38ec41f9137a5a560dac6effbde059c12cb727344821cbdd4ef46964a4791a0f67cd997499a6062fc8b4284bf1b47a91541fd0e65129505f02e4d08542b16fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d61900000000000d9d56000ba40001690fe262010100900000000000000001000003e8000001f30000000000989680
23515 DEBUG hsmd: Client: Received message 14 from client
22269 DEBUG 024b9a1fa8e006f1e3937f65f66c408e6da8e1ca728ea43222a7381df1cc449605-hsmd: Got WIRE_HSMD_ECDH_REQ
14409 DEBUG gossipd: Enqueueing update for announce 0102002f7e4b4deb19947c67292e70cb22f7fac837fa9ee6269393f3c513d0431d52672e7387625856c19299cfd584e1a3f39e0f98df13c99090df9f4d5cca8446776fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d61900000000000e216b0008050001692e1c390101009000000000000003e800000000000013880000004526945a00
12534 DEBUG gossipd: Previously-rejected announce for 514127x248x1
===> 12092 DEBUG connectd: Bad cupdate for 641641x1164x1/1, ignoring (delta=80, fee=1073742199/58)
10761 DEBUG 02e01367e1d7818a7e9a0e8a52badd5c32615e07568dbe0497b6a47f9bef89d6af-channeld-chan#70770: Got it!
10761 DEBUG 02e01367e1d7818a7e9a0e8a52badd5c32615e07568dbe0497b6a47f9bef89d6af-channeld-chan#70770: ... , awaiting 1120
10761 DEBUG 02e01367e1d7818a7e9a0e8a52badd5c32615e07568dbe0497b6a47f9bef89d6af-channeld-chan#70770: Sending master 1020
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I tried removing sign_penalty_to_us, but that comment is wrong: channeld
uses that for the watchtower, so it stays (with updated comment).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
```
common/test/run-param.c:381:8: runtime error: applying zero offset to null pointer
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior common/test/run-param.c:381:8
```
Probably because CI is now on 24.04, so we get a more recent clang. But the test really does
want to see what happens when the callback is NULL, so work around it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We would keep parsing if we were out of tokens, even if we had actually
finished one object!
These are comparisons against the "xpay: use filtering on rpc_command
so we only get called on "pay"." commit, not the disastrous previous one!
tests/test_coinmoves.py::test_generate_coinmoves (2,000,000, sqlite3):
Time (from start to end of l2 node): 126 seconds (was 135)
Worst latency: 5.1 seconds **WAS 12.1**
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
A client can do this by sending a large request, so this allows us to see what
happens if they do that, even though 1MB (2MB buffer) is more than we need.
This drives our performance through the floor: see next patch which gets
us back on track.
tests/test_coinmoves.py::test_generate_coinmoves (2,000,000, sqlite3):
Time (from start to end of l2 node): 271 seconds **WAS 135**
Worst latency: 105 seconds **WAS 12.1**
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This potentially saves us some reads (not measurably though), at the cost
of less fairness. It's important to measure though, because a single
large request will increase buffer size for successive requests, so we
can see this pattern in real usage.
tests/test_coinmoves.py::test_generate_coinmoves (2,000,000, sqlite3):
Time (from start to end of l2 node): 227 seconds (was 239)
Worst latency: 62.4 seconds (was 56.9)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Profiling shows us spending all our time in tal_arr_remove when dealing
with a giant number of output streams. This applies both for RPC output
and plugin output.
Use linked list instead.
tests/test_coinmoves.py::test_generate_coinmoves (2,000,000, sqlite3):
Time (from start to end of l2 node): 239 seconds **WAS 518**
Worst latency: 56.9 seconds **WAS 353**
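
To show why the list wins, here is a stand-alone sketch of O(1) unlinking from a doubly-linked list. Removing from the middle of an array has to shuffle every later element down instead. `struct out_stream` and the `stream_list_*` helpers are hypothetical names for this example, not the ones in the tree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued output stream. */
struct out_stream {
	struct out_stream *prev, *next;
	int id;
};

/* Circular list with a sentinel head: empty when head points at itself. */
static void stream_list_init(struct out_stream *head)
{
	head->prev = head->next = head;
}

/* Append at the tail in O(1). */
static void stream_list_add_tail(struct out_stream *head, struct out_stream *s)
{
	s->prev = head->prev;
	s->next = head;
	head->prev->next = s;
	head->prev = s;
}

/* Unlink in O(1): only the two neighbours are touched, no matter how
 * many streams are queued. */
static void stream_list_del(struct out_stream *s)
{
	/* Must currently be on a list. */
	assert(s->prev && s->next);
	s->prev->next = s->next;
	s->next->prev = s->prev;
	s->prev = s->next = NULL;
}
```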
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we only have 8 or fewer spans at once (as is the normal case), don't
do allocation, which might interfere with tracing.
This doesn't change our test_generate_coinmoves() benchmark.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If we have USDT compiled in, scanning the array of spans becomes
prohibitive if we have really large numbers of requests. In the
bookkeeper code, when catching up with 1.6M channel events, this
became clear in profiling.
Use a hash table instead.
Before:
tests/test_coinmoves.py::test_generate_coinmoves (100,000, sqlite3):
Time (from start to end of l2 node): 269 seconds (vs 14 with HAVE_USDT=0)
Worst latency: 4.0 seconds
After:
tests/test_coinmoves.py::test_generate_coinmoves (100,000, sqlite3):
Time (from start to end of l2 node): 14 seconds
Worst latency: 4.3 seconds
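
A rough sketch of the idea, assuming spans are looked up by a pointer-sized key: a small open-addressing table makes lookup roughly constant time instead of a scan over every active span. All names here (`span_table`, `span_find`, and so on) and the fixed capacity are made up for the example; the real code may use a different table entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPAN_TABLE_SIZE 1024	/* hypothetical capacity, a power of two */

enum slot_state { SLOT_EMPTY = 0, SLOT_USED, SLOT_DELETED };

struct span_slot {
	enum slot_state state;
	uintptr_t key;
	int value;
};

/* Zero-initialized table means every slot starts SLOT_EMPTY. */
struct span_table {
	struct span_slot slots[SPAN_TABLE_SIZE];
};

/* Keys are often aligned pointers, so mix before masking. */
static size_t span_hash(uintptr_t key)
{
	uint64_t h = (uint64_t)key * UINT64_C(0x9E3779B97F4A7C15);
	return (size_t)(h >> 32) & (SPAN_TABLE_SIZE - 1);
}

/* Linear probing: stop at the first never-used slot. */
static struct span_slot *span_find(struct span_table *t, uintptr_t key)
{
	size_t start = span_hash(key);

	for (size_t n = 0; n < SPAN_TABLE_SIZE; n++) {
		struct span_slot *s = &t->slots[(start + n) & (SPAN_TABLE_SIZE - 1)];
		if (s->state == SLOT_EMPTY)
			return NULL;
		if (s->state == SLOT_USED && s->key == key)
			return s;
	}
	return NULL;
}

/* Assumes the key is not already present; returns NULL if full. */
static struct span_slot *span_add(struct span_table *t, uintptr_t key, int value)
{
	size_t start = span_hash(key);

	for (size_t n = 0; n < SPAN_TABLE_SIZE; n++) {
		struct span_slot *s = &t->slots[(start + n) & (SPAN_TABLE_SIZE - 1)];
		if (s->state != SLOT_USED) {
			s->state = SLOT_USED;
			s->key = key;
			s->value = value;
			return s;
		}
	}
	return NULL;
}

/* Leave a tombstone so later entries in the probe chain stay reachable. */
static void span_del(struct span_slot *s)
{
	assert(s->state == SLOT_USED);
	s->state = SLOT_DELETED;
}
```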
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Somehow I missed this when deprecating `short_channel_id` being null.
Changelog-Deprecated: Plugins: `channel_state_changed` notification `message` field being `null`: it will be omitted instead.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I added amount_msat_accumulate for the "a+=b" case, but I was struggling
with a name for the subtractive equivalent. After some prompting, ChatGPT
suggested deduct.
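
As a sketch of the naming pair: accumulate is "a += b" with an overflow check, and deduct is "a -= b" with an underflow check. `struct msat`, `msat_accumulate` and `msat_deduct` below are simplified stand-ins written for this example, not the tree's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for an amount in millisatoshis. */
struct msat {
	uint64_t millisatoshis;
};

/* a += b.  Returns false and leaves *a untouched on overflow. */
static bool msat_accumulate(struct msat *a, struct msat b)
{
	assert(a);
	if (a->millisatoshis > UINT64_MAX - b.millisatoshis)
		return false;
	a->millisatoshis += b.millisatoshis;
	return true;
}

/* a -= b.  Returns false and leaves *a untouched on underflow. */
static bool msat_deduct(struct msat *a, struct msat b)
{
	assert(a);
	if (b.millisatoshis > a->millisatoshis)
		return false;
	a->millisatoshis -= b.millisatoshis;
	return true;
}
```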
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Only in developer mode, ofc.
Notes:
1. We have to move the initialization before the lightningd main trace_start,
since that uses pseudorand().
2. To make the results stable, we need to use per-caller values to randbytes().
Otherwise external timing changes the call order.
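
A sketch of why per-caller values help, under the assumption that each call site gets its own stream derived from the seed: what one caller reads no longer depends on how calls from different callers interleave. `struct caller_rand`, `caller_rand_init` and `caller_randbytes` are invented for this example, and splitmix64 is a swapped-in generic mixer, not necessarily what the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* splitmix64 step: a small, well-known mixer (not cryptographic). */
static uint64_t splitmix64(uint64_t *state)
{
	uint64_t z = (*state += UINT64_C(0x9E3779B97F4A7C15));

	z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
	z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
	return z ^ (z >> 31);
}

/* Hypothetical per-caller stream: each call site owns its own state. */
struct caller_rand {
	uint64_t state;
};

/* Derive the stream from the global seed plus a per-caller value. */
static void caller_rand_init(struct caller_rand *cr,
			     uint64_t seed, uint64_t caller_id)
{
	assert(cr);
	cr->state = seed ^ (caller_id * UINT64_C(0xD6E8FEB86659FD93));
}

/* Fill buf from this caller's stream only; other callers are unaffected. */
static void caller_randbytes(struct caller_rand *cr,
			     unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = (unsigned char)(splitmix64(&cr->state) & 0xff);
}
```

With a single shared stream, swapping the order of two callers would swap the bytes they receive; here each caller sees the same bytes in either order.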
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They are invalid! This is because our BOLT11_FIELD_BYTE_LIMIT is not the limit,
it's one greater than the limit.
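
The arithmetic behind the off-by-one, as a sketch: a BOLT 11 tagged field has a 10-bit data_length counted in 5-bit groups, so at most 1023 groups (5115 bits) fit, which is 639 whole bytes. A 640-byte payload needs 1024 groups. The macro and function names below are made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* BOLT 11: data_length is 10 bits and counts 5-bit groups. */
#define MAX_FIELD_GROUPS ((1 << 10) - 1)		/* 1023 */
#define MAX_FIELD_BYTES  ((MAX_FIELD_GROUPS * 5) / 8)	/* 639 */

/* Number of 5-bit groups needed to carry `bytes` bytes, rounded up. */
static size_t groups_needed(size_t bytes)
{
	return (bytes * 8 + 4) / 5;
}

/* Hypothetical check: does a payload of this size fit in one field? */
static bool field_fits(size_t bytes)
{
	return groups_needed(bytes) <= MAX_FIELD_GROUPS;
}
```

So `field_fits(639)` holds and `field_fits(640)` does not.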
Reported-by: https://github.com/noblepayne
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON-RPC: `invoice` no longer accepts 640-byte descriptions (it would produce malformed invoices).
We call it once at the end, but calling on each allocation is
excessive, and it shows when dealing with large PSBTs. Testing a
700-input PSBT was unusably slow without this: after this the entire
test ran in 9 seconds.
Changelog-Fixed: JSON-RPC: Dealing with giant PSBTs (700 inputs!) is now much faster.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
No idea if it works, we don't test it and nobody runs it. I guess not.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: Config: non-functioning litecoin support (who knew we even had that?)
Instead of having a separate field to derive the BIP86 base key, we return it in the hsmd init reply once we know that the hsm_secret is of mnemonic type.
Add TLV field to hsmd_init_reply_v4 to communicate the HSM secret type
(mnemonic vs legacy) from HSM to lightningd. This allows lightningd to
automatically determine whether to use BIP86 or BIP32 derivation without
needing separate address types.