palladum-lightning

Author	SHA1	Message	Date
Rusty Russell	acb8a8cc15	gossipd: dev-compact-gossip-store to manually invoke compaction. And tests! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	88f3f97b7c	gossipd: reset dying_channels array after compact. Reported-by: @daywalker90 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	912b40aeff	gossipd: compact when gossip store is 80% deleted records. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Added: `gossipd` now uses a `lightning_gossip_compactd` helper to compact the gossip_store on demand, keeping it under about 210MB.	2026-02-16 17:23:33 +10:30
Rusty Russell	15696d97bd	gossipd: code to invoke compactd and reopen store. This isn't called anywhere yet. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	f56f8adcdf	gossipd: lightningd/lightning_gossip_compactd A new subprocess run by gossipd to create a compacted gossip store. It's pretty simple: a linear compaction of the file. Once it's done the amount it was told to, then gossipd waits until it completes the last bit. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	1fb4da075f	gossipd: put the last_writes array inside struct gossip_store. This is the file responsible for all the writing, so it should be responsible for the rewriting if necessary (rather than gossmap_manage). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	445bcd040a	gossipd: don't compact on startup. We now only need to walk it if we're doing an upgrade. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Changed: `gossipd` no longer compacts gossip_store on startup (improving start times significantly).	2026-02-16 17:23:33 +10:30
Rusty Russell	dfc4ce21de	gossipd: don't gather dying channels during compaction. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	900fd08455	gossipd: use gossmap to load the dying entries. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	1ad8ca9603	gossmap: add callback for gossipd to see dying messages. gossmap doesn't care, so gossipd currently has to iterate through the store to find them at startup. Create a callback for gossipd to use instead. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	e8fd235d4e	common: move gossip_store_wire.csv into common/ from gossipd/ It's used by common/gossip_store.c, which is used by many things other than gossipd. This file belongs in common. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	5dcf39867c	gossipd: write uuid record on startup. This is the first record, and ignored by everything else. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	b1055aa0ac	gossip_store: add UUID entry at front of the store. We also put this in the store_ended message, too: so you can tell if the equivalent_offset there really refers to this new entry (or if two or more rewrites have happened). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	ae957161e6	pytest: fix bogus test_gossip_store_compact_noappend test. It didn't do anything, since the dev_compact_gossip_store command was removed. When we make it do something, it crashes since old_len is 0: ``` gossipd: gossip_store_compact: bad version gossipd: FATAL SIGNAL 6 (version v25.12rc3-1-g9e6c715-modded) ... gossipd: backtrace: ./stdlib/abort.c:79 (__GI_abort) 0x7119bd8288fe gossipd: backtrace: ./assert/assert.c:96 (__assert_fail_base) 0x7119bd82881a gossipd: backtrace: ./assert/assert.c:105 (__assert_fail) 0x7119bd83b516 gossipd: backtrace: gossipd/gossip_store.c:52 (append_msg) 0x56294de240eb gossipd: backtrace: gossipd/gossip_store.c:358 (gossip_store_compact) 0x56294 gossipd: backtrace: gossipd/gossip_store.c:395 (gossip_store_new) 0x56294de24 gossipd: backtrace: gossipd/gossmap_manage.c:455 (setup_gossmap) 0x56294de255 gossipd: backtrace: gossipd/gossmap_manage.c:488 (gossmap_manage_new) 0x56294 gossipd: backtrace: gossipd/gossipd.c:400 (gossip_init) 0x56294de22de9 ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	e383e14cb3	connectd: don't be That Node when someone is gossipping crap. As seen in my logs, we complain about nodes a lot (Hi old CLN!). ``` ===>1589311 DEBUG 02e01367e1d7818a7e9a0e8a52badd5c32615e07568dbe0497b6a47f9bef89d6af-connectd: peer_out WIRE_WARNING 139993 DEBUG lightningd: fixup_scan: block 786151 with 1203 txs 55388 DEBUG plugin-bcli: Log pruned 1001 entries (mem 10508118 -> 10298662) 33000 DEBUG gossipd: Unreasonable timestamp in 0102000a38ec41f9137a5a560dac6effbde059c12cb727344821cbdd4ef46964a4791a0f67cd997499a6062fc8b4284bf1b47a91541fd0e65129505f02e4d08542b16fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d61900000000000d9d56000ba40001690fe262010100900000000000000001000003e8000001f30000000000989680 23515 DEBUG hsmd: Client: Received message 14 from client 22269 DEBUG 024b9a1fa8e006f1e3937f65f66c408e6da8e1ca728ea43222a7381df1cc449605-hsmd: Got WIRE_HSMD_ECDH_REQ 14409 DEBUG gossipd: Enqueueing update for announce 0102002f7e4b4deb19947c67292e70cb22f7fac837fa9ee6269393f3c513d0431d52672e7387625856c19299cfd584e1a3f39e0f98df13c99090df9f4d5cca8446776fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d61900000000000e216b0008050001692e1c390101009000000000000003e800000000000013880000004526945a00 12534 DEBUG gossipd: Previously-rejected announce for 514127x248x1 10761 DEBUG 02e01367e1d7818a7e9a0e8a52badd5c32615e07568dbe0497b6a47f9bef89d6af-channeld-chan#70770: Got it! 10761 DEBUG 02e01367e1d7818a7e9a0e8a52badd5c32615e07568dbe0497b6a47f9bef89d6af-channeld-chan#70770: ... , awaiting 1120 10761 DEBUG 02e01367e1d7818a7e9a0e8a52badd5c32615e07568dbe0497b6a47f9bef89d6af-channeld-chan#70770: Sending master 1020 ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-01-27 09:31:02 +10:30
Rusty Russell	241324aa09	gossipd: don't shortcut dying phase for local channels. This means that we won't complain to peers which gossip about our channels, but it does mean that our channel graph (like other nodes on the network) will show two channels, not one, for the duration. For this reason, we need askrene to omit local dying channels. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-01-08 22:33:19 +10:30
Rusty Russell	d6f6d46c3b	gossipd: move timestamp_reasonable into gossmap_manage.c. It's only used in there anyway. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-12-19 12:37:36 +01:00
Rusty Russell	4c8d656f77	gossipd: don't need hsm fd any more. gossipd no longer makes gossip messages, and hasn't since v24.02, so it doesn't actually need to talk to the hsm daemon. Also, various comments were out of date, so fix those too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-12-19 12:37:36 +01:00
Rusty Russell	f0e9547661	gossipd: make sure we correctly move node announcement when no channel precedes it in the gossip store. We had the test backwards, so we moved it all the time. This bloats our gossip store, as well as not moving it in the case where we need to. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Fixed: gossipd: we would occasionally not show a node announcement in listnodes().	2025-12-17 11:56:42 +10:30
Rusty Russell	8b9020d7b9	global: use clock_time in place of time_now(). Except for tracing, that sticks with time_now(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-13 21:21:29 +10:30
Rusty Russell	522457a12b	connectd, gossipd, pay, bcli: use timemono when solely measuring duration for timeouts. This is immune to things like clock changes, and has the convenient side-effect that it will not be overridden when we override time for developer purposes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-13 21:21:29 +10:30
Rusty Russell	806dc89cad	gossipd: remove --dev-gossip-time setting, we'll use CLN_DEV_SET_TIME. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-13 21:21:29 +10:30
Rusty Russell	2086699b70	common: add randbytes() wrapper to override cryptographic entropy: $CLN_DEV_ENTROPY_SEED Only in developer mode, ofc. Notes: 1. We have to move the initialization before the lightningd main trace_start, since that uses pseudorand(). 2. To make the results stable, we need to use per-caller values to randbytes(). Otherwise external timing changes the call order. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-13 21:21:29 +10:30
Rusty Russell	75616f6b77	common: add new_htable() macro to allocate, initialize and setup memleak coverage for any typed hash table. You can now simply add per-tal-object helpers for memleak, but our older pattern required calling memleak functions explicitly during memleak handling. Hash tables in particular need to be dynamically allocated (we override the allocators using htable_set_allocator and assume this), so it makes sense to have a helper macro that does all three. This eliminates a huge amount of code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-24 11:30:17 +10:30
Rusty Russell	6e5cb299dd	global: remove unnecessary includes from C files. Basically, `devtools/reduce-includes.sh /.c`. Build time from make clean (RUST=0) (includes building external libs): Before: real 0m38.944000-40.416000(40.1131+/-0.4)s user 3m6.790000-17.159000(15.0571+/-2.8)s sys 0m35.304000-37.336000(36.8942+/-0.57)s After: real 0m37.872000-39.974000(39.5466+/-0.59)s user 3m1.211000-14.968000(12.4556+/-3.9)s sys 0m35.008000-36.830000(36.4143+/-0.5)s Build time after touch config.vars (RUST=0): Before: real 0m19.831000-21.862000(21.5528+/-0.58)s user 2m15.361000-30.731000(28.4798+/-4.4)s sys 0m21.056000-22.339000(22.0346+/-0.35)s After: real 0m18.384000-21.307000(20.8605+/-0.92)s user 2m5.585000-26.843000(23.6017+/-6.7)s sys 0m19.650000-22.003000(21.4943+/-0.69)s Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-23 06:44:04 +10:30
Rusty Russell	f6a4e79420	global: remove unnecessary includes from headers. Each header should only include the other headers it needs to compile; `devtools/reduce-includes.sh /.h` does this. The C files then need additional includes if they don't compile. And remove the entirely useless wire/onion_wire.h, which only serves to include wire/onion_wiregen.h. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-23 06:44:04 +10:30
Rusty Russell	e120f87083	Makefile: create a library containing common, wire and bitcoin objects. This means we don't have to manually choose what to link against, which is much of the complexity of our Makefiles: the compiler will automatically use any object files it needs to link. We already do this for ccan as libccan.a, now we have libcommon.a. We don't link against it for everything, as some tests require their own versions. Notes: 1. I get rid of the weird plugins/test/Makefile2 (accidental commit?) 2. Many tests change due to update-mocks. 3. In some places I added the missing dependency on the Makefile itself, though most are in the next patch. Before: Total program size: 221366528 Total tests size: 364243856 After: Total program size: 190733656 Total tests size: 337880888 Build time from make clean (RUST=0) (includes building external libs): Before: real 0m38.227000-44.245000(41.8222+/-1.6)s user 3m2.105000-33.696000(23.1442+/-8.4)s sys 0m35.054000-42.269000(39.7231+/-2)s After: real 0m38.944000-40.416000(40.1131+/-0.4)s user 3m6.790000-17.159000(15.0571+/-2.8)s sys 0m35.304000-37.336000(36.8942+/-0.57)s Build time after touch config.vars (RUST=0): Before: real 0m18.928000-22.776000(21.5084+/-1.1)s user 2m8.613000-36.567000(27.7281+/-7.7)s sys 0m20.458000-23.436000(22.3963+/-0.77)s After: real 0m19.831000-21.862000(21.5528+/-0.58)s user 2m15.361000-30.731000(28.4798+/-4.4)s sys 0m21.056000-22.339000(22.0346+/-0.35)s Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> rusty@rusty-Framework:~/devel/cvs/lightni	2025-10-23 06:44:04 +10:30
Rusty Russell	75164d2c81	gossipd: save gossip store writes, try them again (and fsync) if we get a read issue. This is a last resort, but what else are we supposed to do when we wrote something and it didn't appear? In particular, ZFS doesn't just "fix itself": ``` remaining_fd=200001b0c9761dff0000000001009411e26cd56d68aabc285ee1c8ee43d59be6f939b0ce353d80213918680a7438356b9c5ea6bb001a6 bb37a4dea93776f4abc8cd371525b4d1605a74b89d7cb1bfc8865ddf22288c7ea08b9d98b34155b4aed159eb81732957e6bf79b996752bf2a9995aae ad1d65e7889e826ea0ba42f7746c176fe12f2fe6c04af1a74b4f0a262d20efd57133eb32693c789eb3f09caf4f4c6ecd2f734b3b36e751ffcc2748c5 8feabce4173c4ce6098a2c5397aabf1be5442cb67b5030be11ebd8b9841838dae127fe30000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000002000000a218b9d93000000001005000000000000c060 ``` Note the record appended on the end after all the zeroes. Changelog-Changed: gossipd: add gossip_store recovery for filesystems which do not synchronize read and write (e.g. ZFS on Linux), by disabling mmap reads and rewriting the last records. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-01 13:29:33 +09:30
Rusty Russell	5b3f3270f4	gossmap: use gossmap_disable_mmap() on corruption. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-01 13:29:33 +09:30
Rusty Russell	9fb8870f92	gossip: add COMPLETED bit to mark records which are complete. This should detect partial writes more robustly, since we make a separate pwrite() call to update this flag after the record is written. Previously we were playing a bit loose with synchronization assumptions, which seemed to work on Linux ext4, but not so well elsewhere. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-01 13:29:33 +09:30
Rusty Russell	f882b69dfe	gossipd: remove gossmap_fetch_tail. It only gets called for diagnostics when something goes wrong (and we were going to exit anyway), and it's only useful with mmap (which we now disable on error) but it shouldn't crash: ``` BROKEN gossipd: Truncated gossmap record @7991501/7991523 (len 0): waiting BROKEN gossipd: FATAL SIGNAL 6 (version v25.09) BROKEN gossipd: backtrace: common/daemon.c:41 (send_backtrace) 0x6506817cc529 BROKEN gossipd: backtrace: common/daemon.c:78 (crashdump) 0x6506817cc578 BROKEN gossipd: backtrace: ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0 ((null)) 0x75e8267a032f BROKEN gossipd: backtrace: ./nptl/pthread_kill.c:44 (__pthread_kill_implementation) 0x75e8267f9b2c BROKEN gossipd: backtrace: ./nptl/pthread_kill.c:78 (__pthread_kill_internal) 0x75e8267f9b2c BROKEN gossipd: backtrace: ./nptl/pthread_kill.c:89 (__GI___pthread_kill) 0x75e8267f9b2c BROKEN gossipd: backtrace: ../sysdeps/posix/raise.c:26 (__GI_raise) 0x75e8267a027d BROKEN gossipd: backtrace: ./stdlib/abort.c:79 (__GI_abort) 0x75e8267838fe BROKEN gossipd: backtrace: ./assert/assert.c:96 (__assert_fail_base) 0x75e82678381a BROKEN gossipd: backtrace: ./assert/assert.c:105 (__assert_fail) 0x75e826796516 BROKEN gossipd: backtrace: common/gossmap.c:111 (map_copy) 0x6506817cea77 BROKEN gossipd: backtrace: common/gossmap.c:1870 (gossmap_fetch_tail) 0x6506817d1f93 BROKEN gossipd: backtrace: gossipd/gossmap_manage.c:1442 (gossmap_manage_get_gossmap) 0x6506817c45fb BROKEN gossipd: backtrace: gossipd/gossmap_manage.c:753 (gossmap_manage_handle_get_txout_reply) 0x6506817c5850 BROKEN gossipd: backtrace: gossipd/gossipd.c:574 (recv_req) 0x6506817c172b ``` Reported-by: @grubles Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-01 13:29:33 +09:30
Dusty Daemon	07f4bc39b1	splice: Add `start_batch` and an internal wire type We add `start_batch` to match t-bast’s splicing spec and we add a new internal wire type `WIRE_PROTOCOL_BATCH_ELEMENT` using the type number 0 Changelog-Added: support for `start_batch`	2025-08-14 16:40:04 +09:30
Matt Whitlock	3f6cd59dc9	gossipd: check for existing channel announcement before sigcheck Checking a signature is a CPU-intensive operation that should be performed only if gossmap doesn't already have the channel announcement in question and we're not already checking for the announcement's UTxO. Changelog-Fixed: `gossipd` doesn't waste CPU cycles checking signatures on channel announcements that are already known Issue: https://github.com/ElementsProject/lightning/issues/7972	2025-06-10 16:40:33 -05:00
Rusty Russell	6b77c95b95	gossipd: don't spam the log on duplicate channel_update. This message was too verbose (even for trace!) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-05-15 16:40:33 +09:30
Rusty Russell	9a967d6770	gossipd: don't try to connect to ourselves if we need more peers. Reported-by: JssDWt Closes: https://github.com/ElementsProject/lightning/issues/8200 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-None: trivial	2025-05-08 15:30:14 -05:00
Rusty Russell	fff054f8ee	gossipd: fix false memleak positive in gossmap_manage If there are pending channel announcements, they'll look like a leak unless we scan into the maps. ``` lightningd-2 2025-05-01T07:27:03.922Z BROKEN gossipd: MEMLEAK: 0x60d000000478 lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: label=gossipd/gossmap_manage.c:595:struct pending_cannounce lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: alloc: lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/ccan/ccan/tal/tal.c:488 (tal_alloc_) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/gossipd/gossmap_manage.c:595 (gossmap_manage_channel_announcement) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/gossipd/gossipd.c:205 (handle_recv_gossip) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/gossipd/gossipd.c:300 (connectd_req) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/common/daemon_conn.c:35 (handle_read) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:60 (next_plan) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:422 (do_plan) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/ccan/ccan/io/io.c:439 (io_ready) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/ccan/ccan/io/poll.c:455 (io_loop) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: /home/runner/work/lightning/lightning/gossipd/gossipd.c:660 (main) lightningd-2 2025-05-01T07:27:03.923Z BROKEN gossipd: ../sysdeps/nptl/libc_start_call_main.h:58 (__libc_start_call_main) lightningd-2 2025-05-01T07:27:03.924Z BROKEN gossipd: ../csu/libc-start.c:392 (__libc_start_main_impl) lightningd-2 2025-05-01T07:27:03.924Z BROKEN gossipd: parents: lightningd-2 2025-05-01T07:27:03.924Z BROKEN gossipd: gossipd/gossmap_manage.c:475:struct gossmap_manage ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-05-08 14:01:38 +09:30
Rusty Russell	733efcf7dd	BOLTs: import spec additions for option_simple_close. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-03-18 14:30:58 +10:30
Rusty Russell	67c91a7e5c	BOLTs: Update to version with peer storage merged. Unfortunately a spec typo means the data fields are missing (PR pending), so we still patch those in. The message "your_peer_storage" got renamed to "peer_storage_retrieval", and the option "want_peer_backup_storage" was removed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-EXPERIMENTAL: `experimental-peer-storage` now only advertizes feature 43, not 41.	2025-03-18 14:30:58 +10:30
Alex Myers	1e5a577fb2	gossipd: fix typo Changelog-None	2025-03-03 12:25:26 -06:00
Alex Myers	4f7df828b4	gossipd, chanbackup: reduce logging levels The vast majority of incoming channel updates seem to be cut due to age, which results in noisy logs. Similarly, the chanbackup logging verbosity might better match the equivalent actions in channeld, which are at the debug level. Fixes: #8058 Changelog-None: introduced in 25.02	2025-02-26 14:15:13 +10:30
Rusty Russell	d554206d7e	gossipd: fix bogus message when dying channel is pruned. ``` 2025-01-23T12:31:52.528Z DEBUG gossipd: Pruning channel 839050x1246x0 from network view (ages 1736283379 and 1737600120) 2025-01-27T00:32:01.631Z DEBUG gossipd: Pruning channel 839050x1246x0 from network view (ages 0 and 1737686520) 2025-01-27T00:50:05.998Z BROKEN gossipd: Dying channel 839050x1246x0 already deleted? ``` Easiest not to prune in this case. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	2b4b1479ed	gossipd: check that gossmap code sees updates from gossip_store writes. After analyzing various weird cases where we ended up with duplicate gossip_store entries, it could be explained by us not fully processing the gossip store. It's not clear that my assumptions that we would always see our own writes are true: technically this may require an fsync(). So we now add the check, and do an fsync and try again. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Fixed: gossipd: more sanity checks that we are correctly updating the gossip_store file.	2025-02-11 15:11:47 -06:00
Rusty Russell	8156c83e11	gossipd: check that we are always appending. We had at least one report of overwriting the gossip_store file at offset 1. Make sure this doesn't happen. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	769ccaa4c3	gossipd: correctly process dying channels. Found by inspection. Minor bug, since we'll catch it on the next block, but annoying. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	1df1300cc9	gossip_store: don't need to check for truncated amounts. That's actually caught by the gossmap load now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	9d98740e18	gossmap: stricter checks when gossipd itself loads the gossip_store. This means we will correctly reset the store if it has redundant records, for example. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	4f2a7039c6	gossipd: put gossip_store pointer inside gossmap_manage. It's actually the only one that uses it. We also tweak the way gossip_store handles failure: gossmap_manage now tells it when to reset the corrupted store. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	fdfc7ce62f	gossmap: add (and use) logging hook. Default goes to stderr for LOG_UNUSUAL and higher. We have to whitelist more cases in map_catchup so we don't spam the logs with perfectly-expected (but ignored) messages though. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	607b14fe12	common/gossmap: remove open-by-fd. We only use it in one place, and that was simply to share an fd between gossipd writing and gossipd reading, which may be causing our zfs problem anyway. In fact, it fixes a race if we don't have HAVE_PWRITEV. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	01650ebcd7	gossipd: make sure we never write bad entries. We have reports of crashes on reading gossip_store, including from gossipd itself! ``` lightning_gossipd: common/gossmap.c:121: map_copy: Assertion `offset + len <= map->map_size' failed. ... lightning_gossipd: FATAL SIGNAL (version v24.11) 0x6260c41d682a send_backtrace common/daemon.c:33 0x6260c41e098b status_failed common/status.c:221 0x6260c41e0b41 status_backtrace_exit common/subdaemon.c:18 0x6260c41d68b8 crashdump common/daemon.c:78 0x70508ea6913f ??? ???:0 0x70508e8a0d51 ??? ???:0 0x70508e88a536 ??? ???:0 0x70508e88a40e ??? ???:0 0x70508e8996d1 ??? ???:0 0x6260c41d8b69 map_copy common/gossmap.c:121 0x6260c41d8bab map_be16 common/gossmap.c:142 0x6260c41daa45 map_catchup common/gossmap.c:705 0x6260c41dab95 gossmap_refresh_mayfail common/gossmap.c:1192 0x6260c41daca6 gossmap_refresh common/gossmap.c:1213 0x6260c41cee32 gossmap_manage_get_gossmap gossipd/gossmap_manage.c:1314 0x6260c41d0686 gossmap_manage_new_block gossipd/gossmap_manage.c:1221 0x6260c41cbfdd new_blockheight gossipd/gossipd.c:473 0x6260c41cc363 recv_req gossipd/gossipd.c:584 0x6260c41d6b1d handle_read common/daemon_conn.c:35 0x6260c43175b5 next_plan ccan/ccan/io/io.c:60 0x6260c4317a40 do_plan ccan/ccan/io/io.c:422 0x6260c4317af9 io_ready ccan/ccan/io/io.c:439 0x6260c4319446 io_loop ccan/ccan/io/poll.c:455 0x6260c41cccf4 main gossipd/gossipd.c:665 ``` This implies that we have a message shorter than 2 bytes, which should never happen. An audit didn't shed any light, but let's make sure we don't ever write such a thing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
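
Commit 9fb8870f92 in the list above describes marking each record with a COMPLETED bit, set by a separate pwrite() call after the record itself has been written, so that partial writes can be detected. The following is a minimal Python sketch of that two-step pattern, not the project's C implementation. The header layout (2-byte flags, 2-byte length), the flag value `FLAG_COMPLETED`, and the helper names `append_record`, `read_completed` and `demo` are all assumptions made for illustration and do not match the real gossip_store format.

```python
import os
import struct
import tempfile

# Hypothetical layout for illustration only (NOT the real gossip_store format):
# 2-byte big-endian flags, 2-byte big-endian payload length, then the payload.
HEADER = struct.Struct(">HH")
FLAG_COMPLETED = 0x0001  # assumed bit value


def append_record(fd, offset, payload, complete=True):
    """Write a record at offset, then set the COMPLETED bit in a second pwrite()."""
    os.pwrite(fd, HEADER.pack(0, len(payload)) + payload, offset)
    if complete:
        # Separate write: the flag only becomes visible after the body is written.
        os.pwrite(fd, struct.pack(">H", FLAG_COMPLETED), offset)
    return offset + HEADER.size + len(payload)


def read_completed(fd):
    """Return payloads up to the first incomplete or truncated record."""
    records, offset = [], 0
    while True:
        header = os.pread(fd, HEADER.size, offset)
        if len(header) < HEADER.size:
            break  # end of file
        flags, length = HEADER.unpack(header)
        if not flags & FLAG_COMPLETED:
            break  # partially written record: stop here
        payload = os.pread(fd, length, offset + HEADER.size)
        if len(payload) < length:
            break  # truncated body
        records.append(payload)
        offset += HEADER.size + length
    return records


def demo():
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        end = append_record(fd, 0, b"channel_announcement")
        end = append_record(fd, end, b"channel_update")
        # Simulate a crash between the two pwrite() calls.
        append_record(fd, end, b"node_announcement", complete=False)
        return read_completed(fd)
```

In this sketch a reader never treats the third record as valid, because its flag write never happened. A real store would also need checksums and a policy for retrying or truncating, which are omitted here.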


1226 Commits