palladum-lightning

Author	SHA1	Message	Date
Rusty Russell	15696d97bd	gossipd: code to invoke compactd and reopen store. This isn't called anywhere yet. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	1fb4da075f	gossipd: put the last_writes array inside struct gossip_store. This is the file responsible for all the writing, so it should be responsible for the rewriting if necessary (rather than gossmap_manage). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	445bcd040a	gossipd: don't compact on startup. We now only need to walk it if we're doing an upgrade. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Changed: `gossipd` no longer compacts gossip_store on startup (improving start times significantly).	2026-02-16 17:23:33 +10:30
Rusty Russell	dfc4ce21de	gossipd: don't gather dying channels during compaction. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	e8fd235d4e	common: move gossip_store_wire.csv into common/ from gossipd/ It's used by common/gossip_store.c, which is used by many things other than gossipd. This file belongs in common. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	5dcf39867c	gossipd: write uuid record on startup. This is the first record, and ignored by everything else. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	b1055aa0ac	gossip_store: add UUID entry at front of the store. We also put this in the store_ended message, too: so you can tell if the equivalent_offset there really refers to this new entry (or if two or more rewrites have happened). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	ae957161e6	pytest: fix bogus test_gossip_store_compact_noappend test. It didn't do anything, since the dev_compact_gossip_store command was removed. When we make it do something, it crashes since old_len is 0: ``` gossipd: gossip_store_compact: bad version gossipd: FATAL SIGNAL 6 (version v25.12rc3-1-g9e6c715-modded) ... gossipd: backtrace: ./stdlib/abort.c:79 (__GI_abort) 0x7119bd8288fe gossipd: backtrace: ./assert/assert.c:96 (__assert_fail_base) 0x7119bd82881a gossipd: backtrace: ./assert/assert.c:105 (__assert_fail) 0x7119bd83b516 gossipd: backtrace: gossipd/gossip_store.c:52 (append_msg) 0x56294de240eb gossipd: backtrace: gossipd/gossip_store.c:358 (gossip_store_compact) 0x56294 gossipd: backtrace: gossipd/gossip_store.c:395 (gossip_store_new) 0x56294de24 gossipd: backtrace: gossipd/gossmap_manage.c:455 (setup_gossmap) 0x56294de255 gossipd: backtrace: gossipd/gossmap_manage.c:488 (gossmap_manage_new) 0x56294 gossipd: backtrace: gossipd/gossipd.c:400 (gossip_init) 0x56294de22de9 ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-16 17:23:33 +10:30
Rusty Russell	8b9020d7b9	global: use clock_time in place of time_now(). Except for tracing, that sticks with time_now(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-13 21:21:29 +10:30
Rusty Russell	522457a12b	connectd, gossipd, pay, bcli: use timemono when solely measuring duration for timeouts. This is immune to things like clock changes, and has the convenient side-effect that it will not be overridden when we override time for developer purposes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-13 21:21:29 +10:30
Rusty Russell	6e5cb299dd	global: remove unnecessary includes from C files. Basically, `devtools/reduce-includes.sh /.c`. Build time from make clean (RUST=0) (includes building external libs): Before: real 0m38.944000-40.416000(40.1131+/-0.4)s user 3m6.790000-17.159000(15.0571+/-2.8)s sys 0m35.304000-37.336000(36.8942+/-0.57)s After: real 0m37.872000-39.974000(39.5466+/-0.59)s user 3m1.211000-14.968000(12.4556+/-3.9)s sys 0m35.008000-36.830000(36.4143+/-0.5)s Build time after touch config.vars (RUST=0): Before: real 0m19.831000-21.862000(21.5528+/-0.58)s user 2m15.361000-30.731000(28.4798+/-4.4)s sys 0m21.056000-22.339000(22.0346+/-0.35)s After: real 0m18.384000-21.307000(20.8605+/-0.92)s user 2m5.585000-26.843000(23.6017+/-6.7)s sys 0m19.650000-22.003000(21.4943+/-0.69)s Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-23 06:44:04 +10:30
Rusty Russell	f6a4e79420	global: remove unnecessary includes from headers. Each header should only include the other headers it needs to compile; `devtools/reduce-includes.sh /.h` does this. The C files then need additional includes if they don't compile. And remove the entirely useless wire/onion_wire.h, which only serves to include wire/onion_wiregen.h. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-23 06:44:04 +10:30
Rusty Russell	75164d2c81	gossipd: save gossip store writes, try them again (and fsync) if we get a read issue. This is a last resort, but what else are we supposed to do when we wrote something and it didn't appear? In particular, ZFS doesn't just "fix itself": ``` remaining_fd=200001b0c9761dff0000000001009411e26cd56d68aabc285ee1c8ee43d59be6f939b0ce353d80213918680a7438356b9c5ea6bb001a6 bb37a4dea93776f4abc8cd371525b4d1605a74b89d7cb1bfc8865ddf22288c7ea08b9d98b34155b4aed159eb81732957e6bf79b996752bf2a9995aae ad1d65e7889e826ea0ba42f7746c176fe12f2fe6c04af1a74b4f0a262d20efd57133eb32693c789eb3f09caf4f4c6ecd2f734b3b36e751ffcc2748c5 8feabce4173c4ce6098a2c5397aabf1be5442cb67b5030be11ebd8b9841838dae127fe30000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000002000000a218b9d93000000001005000000000000c060 ``` Note the record appended on the end after all the zeroes. Changelog-Changed: gossipd: add gossip_store recovery for filesystems which do not synchronize read and write (e.g. ZFS on Linux), by disabling mmap reads and rewriting the last records. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-01 13:29:33 +09:30
Rusty Russell	9fb8870f92	gossip: add COMPLETED bit to mark records which are complete. This should detect partial writes more robustly, since we make a separate pwrite() call to update this flag after the record is written. Previously we were playing a bit loose with synchronization assumptions, which seemed to work on Linux ext4, but not so well elsewhere. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-01 13:29:33 +09:30
Rusty Russell	2b4b1479ed	gossipd: check that gossmap code sees updates from gossip_store writes. After analyzing various weird cases where we ended up with duplicate gossip_store entries, it could be explained by us not fully processing the gossip store. It's not clear that my assumptions that we would always see our own writes are true: technically this may require an fsync(). So we now add the check, and do an fsync and try again. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Fixed: gossipd: more sanity checks that we are correctly updating the gossip_store file.	2025-02-11 15:11:47 -06:00
Rusty Russell	8156c83e11	gossipd: check that we are always appending. We had at least one report of overwriting the gossip_store file at offset 1. Make sure this doesn't happen. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	1df1300cc9	gossip_store: don't need to check for truncated amounts. That's actually caught by the gossmap load now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	4f2a7039c6	gossipd: put gossip_store pointer inside gossmap_manage. It's actually the only one that uses it. We also tweak the way gossip_store handles failure: gossmap_manage now tells it when to reset the corrupted store. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	607b14fe12	common/gossmap: remove open-by-fd. We only use it in one place, and that was simply to share an fd between gossipd writing and gossipd reading, which may be causing our zfs problem anyway. In fact, it fixes a race if we don't have HAVE_PWRITEV. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	01650ebcd7	gossipd: make sure we never write bad entries. We have reports of crashes on reading gossip_store, including from gossipd itself! ``` lightning_gossipd: common/gossmap.c:121: map_copy: Assertion `offset + len <= map->map_size' failed. ... lightning_gossipd: FATAL SIGNAL (version v24.11) 0x6260c41d682a send_backtrace common/daemon.c:33 0x6260c41e098b status_failed common/status.c:221 0x6260c41e0b41 status_backtrace_exit common/subdaemon.c:18 0x6260c41d68b8 crashdump common/daemon.c:78 0x70508ea6913f ??? ???:0 0x70508e8a0d51 ??? ???:0 0x70508e88a536 ??? ???:0 0x70508e88a40e ??? ???:0 0x70508e8996d1 ??? ???:0 0x6260c41d8b69 map_copy common/gossmap.c:121 0x6260c41d8bab map_be16 common/gossmap.c:142 0x6260c41daa45 map_catchup common/gossmap.c:705 0x6260c41dab95 gossmap_refresh_mayfail common/gossmap.c:1192 0x6260c41daca6 gossmap_refresh common/gossmap.c:1213 0x6260c41cee32 gossmap_manage_get_gossmap gossipd/gossmap_manage.c:1314 0x6260c41d0686 gossmap_manage_new_block gossipd/gossmap_manage.c:1221 0x6260c41cbfdd new_blockheight gossipd/gossipd.c:473 0x6260c41cc363 recv_req gossipd/gossipd.c:584 0x6260c41d6b1d handle_read common/daemon_conn.c:35 0x6260c43175b5 next_plan ccan/ccan/io/io.c:60 0x6260c4317a40 do_plan ccan/ccan/io/io.c:422 0x6260c4317af9 io_ready ccan/ccan/io/io.c:439 0x6260c4319446 io_loop ccan/ccan/io/poll.c:455 0x6260c41cccf4 main gossipd/gossipd.c:665 ``` This implies that we have a message shorter than 2 bytes, which should never happen. An audit didn't shed any light, but let's make sure we don't ever write such a thing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-02-11 15:11:47 -06:00
Rusty Russell	ebf784ef9c	gossipd: use u64 for the one offset we don't. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-10-08 09:50:17 +02:00
Rusty Russell	744116e501	gossipd: make extra-sure we don't put in redundant channel_announcement messages. We only write these in two places: one where we get a message from lightningd about our own channel, and one where we get a reply from lightningd about a txout check. The former case we explicitly check that we don't already have it in gossmap, so add checks to the latter case, and give verbose detail if it's found. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-05-23 20:23:36 +02:00
Rusty Russell	1350194698	gossipd: don't assert on redundant flag write, just log message. This happens to Vincenzo, and I think it's due to previous gossip_store issues. Fixes: https://github.com/ElementsProject/lightning/issues/7051 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-12 18:05:06 +01:00
Rusty Russell	ca5b7b00b6	gossipd: clean up flags accessors. We want to be able to clear them, and fetch them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-12 11:43:33 +01:00
Rusty Russell	6a02cfccd7	gossipd: simplify gossip store API. Instead of "new" and "load", we don't really need to "load" anything, so do everything in gossip_store_new. Have it do the compaction/rewrite, and collect the dying records	2024-02-04 09:24:44 +10:30
Rusty Russell	c49fb2edd5	gossipd: clean up gossip_store offsets. gossmap offsets are to the beginning of the message, whereas the gossip_store uses the header offset. Convert the internals of gossip_store to use gossmap-style uniformly, even where it's a little less convenient. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	7efa0a46a9	gossipd: clean up gossip_store routines. We don't use the dying flag, and we can manually append the addendum rather than having gossip_store_add present a bizarre interface. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	f7b7cf3719	gossipd: remove routing.c and other unused functions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	c286241ab3	gossipd: switch over to using gossmap_manage, not routing.c. The gossip_store_load is now basically a noop, since gossmap does that. gossipd removes a pile of routines dealing with messages, in favor of just handing them to gossmap_manage. The stub gossmap_manage constructor is removed entirely. We simplified behaviour around channel_announcements with no channel update: we now add them to the store, and go back to fix the timestamp later. This changes a test, which explicitly tests for the old behaviour. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	1384d56db3	gossmap_manage: new file for managing the gossip store. This is a fair amount of code, but much is taken from the old routing.c, with the difference that this uses common/gossmap instead of our own structures. The interfaces are fairly clear: 1. gossmap_manage_new - allocator 2. gossmap_manage_channel_announcement - handle new channel announcement msg - if too early, keeps it in early map - queues it, asks lightningd about UTXO. 3. gossmap_manage_handle_get_txout_reply - handle response from lightningd for above. 4. gossmap_manage_channel_update - handle channel_update message - may have to wait on pending channel_announcement 5. gossmap_manage_node_announcement - handle node_announcement msg - may have to wait on pending channel_announcement 6. gossmap_manage_new_block - see if early announces can now be processed. 7. gossmap_manage_channel_spent - lightningd tells us UTXO is spent - may prepare channel for closing in 12 blocks. 8. gossmap_manage_channel_dying - gossip_store load tells us channel was spent earlier. - like gossmap_manage_channel_spent, but maybe < 12. 9. gossmap_manage_get_gossmap - gossmap accessor: seeker and queries will need this. 10. gossmap_manage_new_peer - a new peer has connected, give them all our gossip. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	07cd4a809b	gossipd: remove spam handling. We weakened this progressively over time, and gossip v1.5 makes spam impossible by protocol, so we can wait until then. Removing this code simplifies things a great deal! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Removed: Protocol: we no longer ratelimit gossip messages by channel, making our code far simpler.	2024-02-04 09:24:44 +10:30
Rusty Russell	e7ceffd565	gossipd: remove zombie handling. We never enabled it, because we seemed to be eliminating valid channels. We discard zombie-marked records on loading. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	8cfbc1e7ad	gossipd: make gossip_store hold daemon ptr, not rstate. Makes it easier to wean off routing.c. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	7f5fe52320	gossipd: remove online gossip_store compaction. It was an obscure dev command, as it never worked reliably. It would be much easier to re-implement once this is done. This turned out to reveal a tiny leak on tests/test_gossip.py::test_gossip_store_load_amount_truncated where we didn't immediately free chan_ann if it was dangling. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	561859da0c	gossipd: move tell_lightningd_peer_update from routing.c into gossipd.c Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	870c996628	gossipd: new routines to support gossmap compatibility. gossip_store_del - takes a gossmap-style offset-of-msg not offset-of-hdr. gossip_store_flag: set an arbitrary flag on a gossip_store hdr. gossip_store_get_timestamp/gossip_store_set_timestamp: access gossip_store hdr. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-02-04 09:24:44 +10:30
Rusty Russell	2d15745f9e	gossipd: don't put private channel info into store at all. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-01-31 14:47:33 +10:30
Rusty Russell	8db7adc76f	gossipd: no longer take private channel updates from lightningd Lightningd now handles private channels, so we're dismantling the gossipd infrastructure. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-01-31 14:47:33 +10:30
Rusty Russell	f2f43eeffa	gossipd: strip private updates from gossip_store on startup. We rename them to _obs, too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-01-31 14:47:33 +10:30
Rusty Russell	1fc603ea6e	gossipd: remove #if DEVELOPER in favor of runtime flag. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-09-21 20:08:24 +09:30
Rusty Russell	846cec4f2a	gossipd: ignore redundant node_announcement in gossip_store. Don't know how this is happening, but it is not harmful to ignore it for now. Fixes: #6531 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-08-11 12:38:07 +09:30
Alex Myers	4a4da00d28	gossipd: handle upgrade from version 11 gossip_store	2023-07-27 06:41:44 +09:30
Rusty Russell	af7e641445	gossipd: don't "unmark" dying channels' updates if we receive them. This looked like a test flake, but was real: ``` l1.daemon.wait_for_log("closing soon due to the funding outpoint being spent") # We won't gossip the dead channel any more (but we still propagate node_announcement). But connectd is not explicitly synced, so wait for "a bit". time.sleep(1) > assert len(get_gossip(l1)) == 2 E assert 4 == 2 ``` We can see that two channel_updates come in after we mark it dying: ``` gossipd: channel 103x1x0 closing soon due to the funding outpoint being spent gossipd: REPLY WIRE_GOSSIPD_NEW_BLOCKHEIGHT_REPLY with 0 fds 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Received channel_update for channel 103x1x0/0 now DISABLED 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-gossipd: Received channel_update for channel 103x1x0/1 now DISABLED ``` We should keep marking channel_updates the same way. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-07-25 10:11:05 +09:30
Rusty Russell	d16797ce58	gossipd: add dying marker to channel_announcement/channel_update. We don't actually delete them for 12 blocks, but we can't avoid propagating them. We don't mark node_announcements, which is a bit weird, but avoids us tracking logic to un-dying them if a channel is opened. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-07-20 11:47:32 +09:30
Rusty Russell	01670d5e5e	gossipd: use htable, not linked list for peers. This speeds up nodeid lookups, which is useful for the next simplification. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-07-09 16:49:48 +09:30
Alex Myers	40a50c1ada	gossipd: don't fail on gossip deletion Reported in #6270, there was an attempt to delete gossip overrunning the end of the gossip_store. This logs the gossip type that was attempted to be deleted and avoids an immediate crash (tombstones would be fine to skip over at least.) Changelog-None	2023-06-02 12:02:39 +09:30
Alex Myers	54bd024910	gossip_store: remove now-redundant push bit The push bit was convenient for connectd to send our own gossip to peers upon connecting by naively traversing the gossip_store and sending anything flagged `push`. This function is now performed by gossipd leaving no use for the push bit. Changelog-Changed: `gossipd`: gossip_store PUSH bit is no longer set.	2023-04-13 08:48:50 -07:00
Rusty Russell	aaa14846c6	gossipd: ignore zombie flag when loading gossip_store. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2023-03-06 16:15:22 -06:00
Alex Myers	d1402e06f9	gossipd: load and store node_announcements correctly Loading the gossip_store would not create a pending node announcement when the node already had a zombie channel. This would cause the node announcement to attempt to be loaded, but fail because it had no broadcastable channels. Accepting a pending node announcement as when normally loading from the channel corrects this. `node_has_public_channels` taking into account zombie channels enables this behavior. Separately, node_announcements were still being flagged as zombies in the gossip store despite that feature being removed. Changelog-None	2023-03-01 15:36:13 -06:00
Alex Myers	d5246e43bb	gossipd: flag zombie channels when loading from gossip_store Without inheriting zombie status, gossipd would allow regular channel updates into the store until the pruning cycle hits (and the channel is properly flagged) which is 3.5 days. Applying zombie status when reading channel updates from the store prevents this. Changelog-None	2023-03-01 15:36:13 -06:00
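
Commit 9fb8870f92 above describes marking a record as complete with a separate pwrite() call after the record itself has been written, so that a partial write can be detected. The following is a minimal sketch of that idea, not the actual gossip_store code: the header layout (`struct rec_hdr`), the flag value `REC_COMPLETED_BIT`, the host-endian encoding, and the function names are all assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical flag value and header layout, for illustration only.
 * Fields are stored in host byte order to keep the sketch short. */
#define REC_COMPLETED_BIT 0x2000

struct rec_hdr {
	uint16_t flags;
	uint16_t len;
};

/* Append one record at offset `off`.
 * Step 1: write the header (completed flag clear) and the payload.
 * Step 2: set the completed flag with a separate pwrite().
 * Returns the offset just past the record, or -1 on a failed/short write. */
static off_t append_record(int fd, off_t off, const void *msg, uint16_t len)
{
	struct rec_hdr hdr = { .flags = 0, .len = len };

	assert(msg != NULL);

	if (pwrite(fd, &hdr, sizeof(hdr), off) != (ssize_t)sizeof(hdr))
		return -1;
	if (pwrite(fd, msg, len, off + (off_t)sizeof(hdr)) != (ssize_t)len)
		return -1;

	/* flags is the first header field, so it lives at `off`. */
	hdr.flags |= REC_COMPLETED_BIT;
	if (pwrite(fd, &hdr.flags, sizeof(hdr.flags), off)
	    != (ssize_t)sizeof(hdr.flags))
		return -1;

	return off + (off_t)sizeof(hdr) + (off_t)len;
}

/* Returns 1 if the record at `off` has the completed flag set,
 * 0 if it does not, and -1 if the header could not be read. */
static int record_is_complete(int fd, off_t off)
{
	struct rec_hdr hdr;

	if (pread(fd, &hdr, sizeof(hdr), off) != (ssize_t)sizeof(hdr))
		return -1;
	return (hdr.flags & REC_COMPLETED_BIT) != 0;
}

/* Small usage example: write one record to `path`, check it, clean up.
 * Returns 0 on success, -1 otherwise. */
int demo(const char *path)
{
	static const char msg[] = "example-gossip-message";
	int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	off_t end;
	int ok;

	if (fd < 0)
		return -1;
	end = append_record(fd, 0, msg, sizeof(msg));
	ok = end == (off_t)(sizeof(struct rec_hdr) + sizeof(msg))
		&& record_is_complete(fd, 0) == 1;
	close(fd);
	unlink(path);
	return ok ? 0 : -1;
}
```

A reader that finds a record without the flag can treat it as not yet fully written. Whether a reader then retries, calls fsync(), or rewrites the record, as later commits in this history describe, is outside this sketch.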

185 Commits