palladum-lightning

Author	SHA1	Message	Date
Rusty Russell	963b353a30	connectd: use membuf for more efficient output queue. This is exactly what membuf is for: it handles expansion much more neatly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-18 14:13:25 +10:30
Rusty Russell	afdc92fedf	connectd: only do lazy transmission for definitely non-urgent messages. Since we delay the others quite a lot (up to 1 second), it's better to consider most messages "urgent" and worth immediately transmitting. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-18 14:13:25 +10:30
Rusty Russell	2436ee6f6f	connectd: don't flush messages unless we have something important. This replaces our previous nagle-based toggling. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-18 14:13:25 +10:30
Rusty Russell	8b90d40a75	connectd: pad messages with dummy pings if needed to make size uniform. Messages are now constant. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Added: Protocol: we now pad all peer messages to make them the same length.	2026-02-18 14:13:25 +10:30
Rusty Russell	d45bc2d56e	connectd: don't toggle nagle on and off, leave it always off. We're doing our own buffering now. We leave the is_urgent() function for two commits in the future though. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-18 14:13:25 +10:30
Rusty Russell	c23b7a492d	connect: switch to using io_write_partial instead of io_write. This gives us finer control over write sizes: for now we just cap the write size. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-18 14:13:25 +10:30
Rusty Russell	df1ae1d680	connectd: refactor to break up "encrypt_and_send". Do all the special treatment of the message type first. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-18 14:13:25 +10:30
Rusty Russell	7577e59f6c	connectd: refactor outgoing loop. Give us a single "next message" function to call. This will be useful when we want to write more than one at a time. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-18 14:13:25 +10:30
Rusty Russell	63497b3180	pytest: fix flake in test_even_sendcustommsg. We need to make sure the message is fully processed before removing the plugin. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-03 16:12:04 +10:30
Rusty Russell	18a416be0d	connectd: unify IO logging calls. Normally, connectd forwards messages and then the subds do logging, but it logs manually for msgs which are handled internally. Clarify this logic in one place for all callers. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2026-02-03 16:12:04 +10:30
Rusty Russell	acc41ddc0c	connectd: don't log at INFO level for known issue. We get spammed by this, because we somehow missed occasional channel closes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-27 15:04:20 +10:30
Rusty Russell	79e609468a	connectd: don't complain if lightningd is unresponsive while doing dev-memleak. We had a flake of form: ``` 2025-11-18T04:42:23.489Z BROKEN 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: wake delay for WIRE_CHANNEL_REESTABLISH: 6789msec ``` Which happened as we're shutting down. Some investigation revealed the cause: `dev-memleak` can be extremely slow. Fair enough. So we change `dev-memleak` to call connectd first, and connectd uses that as a trigger to stop complaining about delays. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-19 14:29:08 +10:30
Rusty Russell	522457a12b	connectd, gossipd, pay, bcli: use timemono when solely measuring duration for timeouts. This is immune to things like clock changes, and has the convenient side-effect that it will not be overridden when we override time for developer purposes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-13 21:21:29 +10:30
Rusty Russell	565f7deec0	connectd: at disconnected, tell lightningd how long we were connected. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-12 13:58:43 +10:30
Rusty Russell	88b9b0bc28	connectd: report ping latencies (from ping probes) to lightningd. (Uninitialize ping_start on manual ping fixed by Alex Myers) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-11-12 13:58:43 +10:30
Rusty Russell	6e5cb299dd	global: remove unnecessary includes from C files. Basically, `devtools/reduce-includes.sh /.c`. Build time from make clean (RUST=0) (includes building external libs): Before: real 0m38.944000-40.416000(40.1131+/-0.4)s user 3m6.790000-17.159000(15.0571+/-2.8)s sys 0m35.304000-37.336000(36.8942+/-0.57)s After: real 0m37.872000-39.974000(39.5466+/-0.59)s user 3m1.211000-14.968000(12.4556+/-3.9)s sys 0m35.008000-36.830000(36.4143+/-0.5)s Build time after touch config.vars (RUST=0): Before: real 0m19.831000-21.862000(21.5528+/-0.58)s user 2m15.361000-30.731000(28.4798+/-4.4)s sys 0m21.056000-22.339000(22.0346+/-0.35)s After: real 0m18.384000-21.307000(20.8605+/-0.92)s user 2m5.585000-26.843000(23.6017+/-6.7)s sys 0m19.650000-22.003000(21.4943+/-0.69)s Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-23 06:44:04 +10:30
Rusty Russell	f6a4e79420	global: remove unnecessary includes from headers. Each header should only include the other headers it needs to compile; `devtools/reduce-includes.sh /.h` does this. The C files then need additional includes if they don't compile. And remove the entirely useless wire/onion_wire.h, which only serves to include wire/onion_wiregen.h. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-23 06:44:04 +10:30
Rusty Russell	fc6bee8950	connectd: close connection properly after dev-disconnect `+`. We need to drain subds too, otherwise if timing is correct we never close. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-14 09:48:02 +10:30
Rusty Russell	694626f050	connectd: fix race where last msg can still get lost. openingd sends an ERROR, and exits. lightningd tells us to disconnect. We read from lightningd first, and don't read from openingd. We need to drain subds when we're told to disconnect.	2025-10-01 12:12:56 +09:30
Rusty Russell	0d97631075	connectd: simplify logic, and add a "reconnected" message. One issue we have in CI is reconnection races: if an incoming connection arrives while an outgoing one is negotiated, we close the outgoing one and issue a disconnect, which fails any connect attempts. By sending a "reconnected" message instead of disconnect/connect we can avoid disturbing in-progress connection attempts which happens in CI quite a bit. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-10-01 12:12:56 +09:30
Rusty Russell	7fed03b910	pytest: fix broken message in test_even_sendcustommsg. We can stop listening on the incoming peer while we are closing, so we don't notice if they close: ``` ['lightningd-2 2025-09-03T09:48:19.555Z BROKEN 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Peer did not close, forcing close', 'lightningd-2 2025-09-03T09:48:22.918Z BROKEN 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Peer did not close, forcing close'] =========================== short test summary info ============================ ERROR tests/test_misc.py::test_even_sendcustommsg - ValueError: ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-09-30 11:37:31 +09:30
Matt Whitlock	7e85f924a4	connectd: demote "Peer did not close, forcing close" to UNUSUAL. This message is logged when connectd tries to shut down a peer connection but the transmit buffer remains full for too long, maybe because the peer has crashed or has lost connectivity. Logging this message at the BROKEN level is inappropriate because BROKEN is intended to flag logic errors that imply incorrect code in CLN. The error in question here is actually a runtime error, which does not imply incorrect code (at least on our side), so demote the log message to the UNUSUAL level. (Even this is still probably too severe, as this message is logged rather more frequently than "unusual" would suggest.) Changelog-None Closes: https://github.com/ElementsProject/lightning/issues/5678	2025-09-30 11:37:31 +09:30
Rusty Russell	6a05f240d3	connectd: fix diagnostics if we get a long delay. In `a0fd72eb5e` I added a diagnostic message if messages cause large delays, but I didn't set the "peer_in_lasttime" variable in the case of locally-handled packets. I really want this in the release: the point of this was to try to diagnose some high-latency ping issues we've seen on the real network. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-09-02 16:02:03 +09:30
Rusty Russell	0456bace5d	connectd: drop excess gossipd messages. We haven't seen the "excessive queue length" backtrace since we fixed gossipd, so it's safe to drop excess messages without worrying about losing gossip. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-08-18 10:01:07 +09:30
Rusty Russell	5f5440383d	lightningd: fix race with crossover pings. We cannot use subd_req() here: replies will come out of order, and we should not simply assign the responses in FIFO order. Changelog-Fixed: lightningd: don't get confused with parallel ping commands. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-08-14 17:35:39 +09:30
Rusty Russell	a0fd72eb5e	connectd: warn if we ignore peer incoming for longer than 5 seconds. One reason why ping processing could be slow is that, once we receive a message from the peer to send to a subdaemon, we don't listen for others until we've drained that subdaemon queue entirely. This can happen for reestablish: slow machines can take a while to set that subdaemon up. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-08-14 17:35:39 +09:30
Dusty Daemon	052f36cf2e	connectd: Implement sending of `start_batch`. Implement the sending of `start_batch` and `protocol_batch_element` from `channeld` to `connectd`. Each real peer wire message is prefixed with `protocol_batch_element` so connectd can know the size of the messages that were batched together. `connectd` intercepts `protocol_batch_element` messages and eats them (doesn’t forward them to peer) to get individual messages out of the batch. It needs this to be able to encrypt them individually. Afterwards it recombines the now encrypted messages into a single message to send over the wire to the peer. `channeld` remains responsible for making `start_batch` the first message of the message bundle.	2025-08-14 16:40:04 +09:30
Dusty Daemon	07f4bc39b1	splice: Add `start_batch` and an internal wire type. We add `start_batch` to match t-bast’s splicing spec and we add a new internal wire type `WIRE_PROTOCOL_BATCH_ELEMENT` using the type number 0. Changelog-Added: support for `start_batch`	2025-08-14 16:40:04 +09:30
Rusty Russell	e27dee0fc4	connectd: fix nagle disabling logic. Our CORK logic was wrong, and it's better to use Nagle anyway: we disable Nagle just before sending timing-critical messages. Time for 100 (failed) payments: Before: 148.8573575 After: 10.7356977 Note this revealed a timing problem in test_reject_invalid_payload: we would miss the cause of the sendonion failure, and waitsendpay would be called after it had failed, so simply returns "Payment failure reason unknown". Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Fixed: Protocol: Removed 200ms latency from sending commit/revoke messages.	2025-05-08 14:01:38 +09:30
Rusty Russell	733efcf7dd	BOLTs: import spec additions for option_simple_close. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-03-18 14:30:58 +10:30
Rusty Russell	67c91a7e5c	BOLTs: Update to version with peer storage merged. Unfortunately a spec typo means the data fields are missing (PR pending), so we still patch those in. The message "your_peer_storage" got renamed to "peer_storage_retrieval", and the option "want_peer_backup_storage" was removed. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-EXPERIMENTAL: `experimental-peer-storage` now only advertizes feature 43, not 41.	2025-03-18 14:30:58 +10:30
Rusty Russell	5ae183b49b	connectd: attach input filtering for incoming dev_disconnect. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-01-27 11:07:04 +10:30
Rusty Russell	44c6a22e5f	dev_disconnect: rename to dev_disconnect_out, in preparation for incoming filters. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2025-01-27 11:07:04 +10:30
Alex Myers	d099f9fe5b	connectd: force our own channel gossip to more peers. Large nodes were not always getting their own channel gossip out reliably. The number of peers we spam our own channel gossip to is limited to save large nodes on startup, but this should be relaxed slightly to ensure propagation. Changelog-Fixed: Own-channel gossip is broadcast to more peers on connect.	2024-11-28 14:54:08 +10:30
Rusty Russell	3d294f813d	connectd: limit to 10 connections at once. We wait until a connection fails, or a subd is connected to the peer, before letting another one through. This should prevent us from overwhelming lightningd on large nodes, but unlike the previous back-off, it's based on how fast lightningd is, not an arbitrary time. We also let one through each second, in case we're connecting to many, but not doing anything but gossip (e.g. 100 explicit connect commands). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Changed: Reconnecting to peers at startup should be significantly faster (dependent on machine speed).	2024-11-25 15:39:13 +10:30
Rusty Russell	5b92383b02	connectd: send self-advertizing gossip rather than having gossipd do it. It's now trivial for us to do this ourselves, since we have gossmap. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-11-25 15:39:13 +10:30
Rusty Russell	47584bd504	connectd: tie gossip query responses into ratelimiting code. A bit tricky, since we get more than one message at a time. However, this just means we go over quota for a bit, and will get caught when those are sent (we do this for a single message already, so it's not that much worse). Note: this not only limits sending, but it limits the actual query processing, which is nice. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-07-10 12:21:19 +09:30
Rusty Russell	4a78d17748	connectd: do response to gossip queries, don't hand them to gossipd. This basically means moving the code from gossipd to connectd to handle these queries. This will let connectd have finer control over ratelimiting them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-07-10 12:21:19 +09:30
Rusty Russell	d60977f37f	connectd: use gossmap streaming interface. This is more efficient in a few ways: 1. It's trivial to get to the end of the gossip_store, we don't have to iterate. 2. It tends to be mmaped so we don't have to call pread(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-07-10 12:21:19 +09:30
Rusty Russell	401533667d	connectd: throttle streaming gossip for peers. We currently stream gossip as fast as we can, even if they start at timestamp 0. Instead, use a simple token bucket filter and only let them have 1MB per second (500 bytes per second for testing). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Protocol: connectd: we now throttle outgoing gossip at 1MB/second per peer.	2024-07-10 12:21:19 +09:30
Rusty Russell	5e585d061f	connectd: log incoming onion message IO properly. I noticed we were missing this. Move logging up a level so it's easier to spot the omission. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-07-09 15:09:29 +02:00
Rusty Russell	01cd605cb1	connectd: fix missing peer close. We were getting the following message in test_feerate_stress: ``` 2024-07-08T02:15:45.5663941Z lightningd-2 2024-07-08T02:13:45.696Z BROKEN 0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Peer did not close, forcing close ``` I can reproduce it locally if I run the test enough, and finally found the issue by printing the status of the fd when we time it out (using routines from connectd.c). The peer fd alternates between reading and writing. When we go to discard it, we wake the write queue, so write_to_peer() gets called. It won't shutdown the socket if there are still subds attached, and will wait again for a read. The last subd exit has to also wake the write queue if we're draining, so it can do the io_sock_shutdown. Otherwise, we hit the timeout, causing the message above. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-07-09 18:03:44 +09:30
Rusty Russell	002dc60b33	Gossip: BOLT catch, remove initial_routing_sync. Everyone sends a gossip_timestamp_filter message these days to start gossip. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-06-19 15:54:24 +09:30
Rusty Russell	06cf5ac841	Doc: update bolts to assume gossip_queries under the new meaning. Everyone understands gossip_queries now, but peers leave it unset to indicate they have nothing useful to say. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-06-19 15:54:24 +09:30
Rusty Russell	5d061c4cf4	global: remove tags from BOLT quotes now dual-funding is in master. A few of them had minor wording changes, too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-05-09 16:14:23 -05:00
Rusty Russell	d3dbcf03fa	channeld: close an unimportant connection when fds get low. We use a crude heuristic: if we were trying to contact them, it's a "deliberate" connection, and should be preserved. Changelog-Changed: connectd: prioritize peers with channels (and log!) if we run low on file descriptors. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-05-09 01:23:46 -05:00
Rusty Russell	3bfe622413	connectd: log when we fail to receive an fd from lightningd. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-05-09 01:23:46 -05:00
Rusty Russell	e0e879c003	common: remove type_to_string files altogether. This means including <common/utils.h> where it was indirectly included. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-03-20 13:51:48 +10:30
Rusty Russell	07cd4a809b	gossipd: remove spam handling. We weakened this progressively over time, and gossip v1.5 makes spam impossible by protocol, so we can wait until then. Removing this code simplifies things a great deal! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Removed: Protocol: we no longer ratelimit gossip messages by channel, making our code far simpler.	2024-02-04 09:24:44 +10:30
Rusty Russell	db6f0da3b3	connectd: separate routine to inject message without closing connection. We will want this to send private channel_updates direct to peer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2024-01-31 14:47:33 +10:30
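
Commit 401533667d above describes throttling streamed gossip with "a simple token bucket filter" at 1MB per second per peer. The following is a minimal Python sketch of that general technique, not connectd's actual C code: the class name `TokenBucket`, the method `try_send`, the burst capacity, and reading "1MB" as 1,000,000 bytes are all assumptions made for illustration. Timestamps are passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Hypothetical per-peer byte budget (illustration only, not CLN code)."""

    def __init__(self, rate_bytes_per_sec, capacity_bytes, now=0.0):
        self.rate = rate_bytes_per_sec    # refill speed, bytes per second
        self.capacity = capacity_bytes    # maximum burst size (assumed)
        self.tokens = capacity_bytes      # the bucket starts full
        self.last = now                   # time of the last refill

    def _refill(self, now):
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def try_send(self, nbytes, now):
        # Spend nbytes of budget if available; otherwise the caller should wait.
        self._refill(now)
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False


# Assumed figures: 1,000,000 bytes/second with a one-second burst allowance.
bucket = TokenBucket(1_000_000, 1_000_000, now=0.0)
first = bucket.try_send(600_000, now=0.0)    # allowed: the bucket starts full
second = bucket.try_send(600_000, now=0.0)   # refused: only 400,000 tokens remain
third = bucket.try_send(600_000, now=0.5)    # allowed: 500,000 tokens refilled
```

This sketch refuses any send that exceeds the remaining budget. Commit 47584bd504 notes that the real code can "go over quota for a bit" when several query responses arrive at once, which this strict version does not model.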

132 Commits