Commit Graph

199 Commits

Author SHA1 Message Date
Rusty Russell
bcce29eeb0 connectd: remove unused flag to connect_init.
We haven't announced websocket addresses for some time!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-01-20 19:32:42 +10:30
Rusty Russell
25fb75e4be pytest: restore and fix disabled test test_excluded_adjacent_routehint.
1. It was flaky, probably because it didn't wait for the remote update_channel.
2. Rusty applied a fix in 5f664dac77, not clear if it worked.
3. Christian disabled it altogether in 23ce9a947d.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-01-08 22:33:19 +10:30
Rusty Russell
97e58e41f6 connectd: fix race when we supply a new address.
This shows up as a flake in test_route_by_old_scid:


```
        # Now restart l2, make sure it remembers the original!
        l2.restart()
>       l2.rpc.connect(l1.info['id'], 'localhost', l1.port)

tests/test_splicing.py:554: 
...
>           raise RpcError(method, payload, resp['error'])
E           pyln.client.lightning.RpcError: RPC call failed: method: connect, payload: {'id': '0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518', 'host': 'localhost', 'port': 33837}, error: {'code': 400, 'message': 'Unable to connect, no address known for peer'}
```

This is because it's already (auto)connecting, and fails.  This
failure is reported, before we've added the new address (once we add
the new address, we connect fine, but it's too late!):

```
lightningd-2 2025-12-08T02:39:18.241Z DEBUG   gossipd: REPLY WIRE_GOSSIPD_NEW_BLOCKHEIGHT_REPLY with 0 fds
lightningd-2 2025-12-08T02:39:18.320Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Initializing important peer with 0 addresses
lightningd-2 2025-12-08T02:39:18.320Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Failed connected out: Unable to connect, no address known for peer
lightningd-2 2025-12-08T02:39:18.344Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Will try reconnect in 1 seconds
lightningd-2 2025-12-08T02:39:18.344Z DEBUG   035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-connectd: Initializing important peer with 1 addresses
lightningd-2 2025-12-08T02:39:18.344Z DEBUG   035d2b1192dfba134e10e540875d366ebc8bc353d5aa766b80c090b39c3a5d885d-connectd: Connected out, starting crypto
lightningd-2 2025-12-08T02:39:18.344Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Adding 1 addresses to important peer
lightningd-2 2025-12-08T02:39:18.345Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: Connected out, starting crypto
{'run_id': 256236335046680576, 'github_repository': 'ElementsProject/lightning', 'github_sha': '555f1ac461d151064aad6fc83b94a0290e2e9d5d', 'github_ref': 'refs/pull/8767/merge', 'github_ref_name': 'HEAD', 'github_run_id': 20013689862, 'github_head_ref': 'fixup-backfill-bug', 'github_run_number': 14774, 'github_base_ref': 'master', 'github_run_attempt': '1', 'testname': 'test_route_by_old_scid', 'start_time': 1765161493, 'end_time': 1765161558, 'outcome': 'fail'}
lightningd-2 2025-12-08T02:39:18.421Z DEBUG   022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-hsmd: Got WIRE_HSMD_ECDH_REQ
lightningd-2 2025-12-08T02:39:18.421Z DEBUG   hsmd: Client: Received message 1 from client
lightningd-2 2025-12-08T02:39:18.453Z DEBUG   022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-hsmd: Got WIRE_HSMD_ECDH_REQ
lightningd-2 2025-12-08T02:39:18.453Z DEBUG   hsmd: Client: Received message 1 from client
--------------------------- Captured stdout teardown ---------------------------
```
2026-01-08 22:33:19 +10:30
Rusty Russell
213cbba5bf lightningd: allow filtering on custommsg hook too.
Changelog-Added: Plugins: "filters" can be specified on the `custommsg` hook to limit what message types the hook will be called for.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-11-20 16:30:50 +10:30
Rusty Russell
55d622bd0e lightningd: fix race with mutual connect.
65dccea5bd "pytest: fix flake in test_reconnect_signed" accidentally
introduced a bug, where the connect command may not return.

If we call "connect" while a connection is still being processed
through the peer_connected hooks, we would call peer_channels_cleanup(),
which (if the peer has no channels) would free the peer.

Then when the peer_connected hook returned, it would lookup the peer,
see it was gone, and silently return.  The connect_succeeded() function
was never called, and the connect command never woken.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-None: bug introduced this release.
2025-11-19 16:04:21 +10:30
Rusty Russell
96fd13a811 lightnind: add connectd's reported events to the db.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-11-12 13:58:43 +10:30
Rusty Russell
21ad33151c connected: tell lightningd if we didn't find an address we could even *try* to connect to.
This is important: if it's tor-only and we don't have a proxy, we will fail
to connect, but it's no indication that the node is unreachable.  Same with
IPv6.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-11-12 13:58:43 +10:30
Rusty Russell
88b9b0bc28 connectd: report ping latencies (from ping probes) to lightningd.
(Uninitialize ping_start on manual ping fixed by Alex Myers)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-11-12 13:58:43 +10:30
Rusty Russell
0f07578c3f connectd: return reason, connect time to lightningd on connection results.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-11-12 13:58:43 +10:30
Rusty Russell
6e5cb299dd global: remove unnecessary includes from C files.
Basically, `devtools/reduce-includes.sh */*.c`.

Build time from make clean (RUST=0) (includes building external libs):

Before:
	real    0m38.944000-40.416000(40.1131+/-0.4)s
	user    3m6.790000-17.159000(15.0571+/-2.8)s
	sys     0m35.304000-37.336000(36.8942+/-0.57)s
After:
	real    0m37.872000-39.974000(39.5466+/-0.59)s
	user    3m1.211000-14.968000(12.4556+/-3.9)s
	sys     0m35.008000-36.830000(36.4143+/-0.5)s

Build time after touch config.vars (RUST=0):

Before:
	real    0m19.831000-21.862000(21.5528+/-0.58)s
	user    2m15.361000-30.731000(28.4798+/-4.4)s
	sys     0m21.056000-22.339000(22.0346+/-0.35)s

After:
	real    0m18.384000-21.307000(20.8605+/-0.92)s
	user    2m5.585000-26.843000(23.6017+/-6.7)s
	sys     0m19.650000-22.003000(21.4943+/-0.69)s

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-10-23 06:44:04 +10:30
Rusty Russell
f6a4e79420 global: remove unnecessary includes from headers.
Each header should only include the other headers it needs to compile;
`devtools/reduce-includes.sh */*.h` does this.  The C files then need
additional includes if they don't compile.

And remove the entirely useless wire/onion_wire.h, which only serves to include wire/onion_wiregen.h.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-10-23 06:44:04 +10:30
Rusty Russell
0d97631075 connectd: simplify logic, and add a "reconnected" message.
One issue we have in CI is reconnection races: if an incoming
connection arrives while an outgoing one is negotiated, we close the
outgoing one and issue a disconnect, which fails any connect attempts.

By sending a "reconnected" message instead of disconnect/connect we
can avoid disturbing in-progress connection attempts which happens in CI
quite a bit.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-10-01 12:12:56 +09:30
Rusty Russell
5f5440383d lightningd: fix race with crossover pings.
We cannot use subd_req() here: replies will come out of order, and the
we should not simply assign the reponses in FIFO order.

Changelog-Fixed: lightningd: don't get confused with parallel ping commands.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-08-14 17:35:39 +09:30
Rusty Russell
65dccea5bd pytest: fix flake in test_reconnect_signed
We can fail the fundchannel because of a reconnect race: funder tells
us to reconnect so fast that we are still cleaning up from last time.

We deliberately defer clean up, to give the subds a chance to process any
final messages.  However, on reconnection we force them to exit immediately:
but this causes new `connect` commands to see the exit and fail.

The workaround is to do this cleanup when a `connect` command is
issued (as well as the current case, which covers an automatic
reconnection or an incoming reconnection)

```
2025-05-09T00:40:37.1769508Z     def test_reconnect_signed(node_factory):
2025-05-09T00:40:37.1770273Z         # This will fail *after* both sides consider channel opening.
2025-05-09T00:40:37.1770850Z         disconnects = ['<WIRE_FUNDING_SIGNED']
2025-05-09T00:40:37.1771298Z         if EXPERIMENTAL_DUAL_FUND:
2025-05-09T00:40:37.1771735Z             disconnects = ['<WIRE_COMMITMENT_SIGNED']
2025-05-09T00:40:37.1772155Z     
2025-05-09T00:40:37.1772598Z         l1 = node_factory.get_node(may_reconnect=True, disconnect=disconnects)
2025-05-09T00:40:37.1773210Z         l2 = node_factory.get_node(may_reconnect=True)
2025-05-09T00:40:37.1773632Z     
2025-05-09T00:40:37.1773917Z         l1.fundwallet(2000000)
2025-05-09T00:40:37.1774268Z     
2025-05-09T00:40:37.1774611Z         l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
2025-05-09T00:40:37.1775151Z >       l1.rpc.fundchannel(l2.info['id'], CHANNEL_SIZE)
2025-05-09T00:40:37.1775388Z 
2025-05-09T00:40:37.1775485Z tests/test_connection.py:667:
...
2025-05-09T00:40:37.1799527Z >           raise RpcError(method, payload, resp['error'])
2025-05-09T00:40:37.1800993Z E           pyln.client.lightning.RpcError: RPC call failed: method: fundchannel, payload: {'id': '022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59', 'amount': 50000, 'announce': True}, error: {'code': -1, 'message': 'Disconnected', 'data': {'id': '022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59', 'method': 'openchannel_update'}}
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-05-11 11:25:40 +09:30
Rusty Russell
0a94f3b570 connectd: remove DNS seed lookups.
DNS seeds have been down/offline for a while, and this code (which
blocks!) has been a source of trouble.  We should probably use a
canned set of "known nodes" if we want to bootstrap.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: Protocol: we no longer use DNS seeds for peer lookup fallbacks.
Fixes: https://github.com/ElementsProject/lightning/issues/7913
2025-05-08 12:54:09 +09:30
Rusty Russell
4e7ba96729 lightningd: don't kill onchaind if we are forcing a disconnect.
We thought it was a good idea to terminate channel subds, but we included onchaind as well!

```
2025-01-29T21:31:46.053Z UNUSUAL 03…00-channeld-chan#255683: Adding HTLC 1737 too slow: killing connection
2025-01-29T21:31:46.053Z INFO    03…00-chan#255683: Peer transient failure in CHANNELD_NORMAL: Adding HTLC timed out: killed connection
2025-01-29T21:31:46.053Z DEBUG   03…00-channeld-chan#255683: Status closed, but not exited. Killing
2025-01-29T21:31:46.054Z DEBUG   03…00-chan#255683: Failing HTLC 1738 due to peer death
2025-01-29T21:31:46.058Z DEBUG   03…00-chan#255683: Failing HTLC 1737 due to peer death
2025-01-29T21:31:46.058Z DEBUG   03…00-chan#255673: Forcing disconnect due to One channel had an error
2025-01-29T21:31:46.059Z DEBUG   03…00-onchaind-chan#255673: Status closed, but not exited. Killing
2025-01-29T21:31:46.093Z DEBUG   03…00-connectd: disconnect_peer
2025-01-29T21:31:46.093Z DEBUG   03…00-lightningd: peer_disconnect_done
```

Reported-by: @whitslack
Fixes: https://github.com/ElementsProject/lightning/issues/8055
Changelog-Fixed: onchaind: don't die if we fail an unrelated channel with the same peer.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2025-02-10 19:19:12 -06:00
Rusty Russell
e655f69b73 lightningd: notification for onionmessage_forward_fail.
Changelog-Added: Plugins: new notification `onionmessage_forward_fail`.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-12-05 17:38:16 +10:30
Rusty Russell
8e4b589a9e connectd: message to tell lightningd if we couldn't forward an onion message.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-12-05 17:38:16 +10:30
Rusty Russell
faf7ae6ad4 pytest: add test for connection ratelimiting.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-11-25 15:39:13 +10:30
Rusty Russell
15950bb7d4 connectd: reconnect for non-transient connections.
Rather than have lightningd call us repeatedly to try to connect, have
it tell us what peers are transient and aren't, and connectd will
automatically try to maintain that connection.

There's a new "downgrade_peer" message to tell it a peer is now
transient: to make it non-transient we simply tell connectd to
connect as a non-transient.

The first time, I missed that dual_open_control does its own state
transitions :(

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: `connectd` now handles maintaining/reconnecting to important peers, and we remember the last successful address we connected to.
2024-11-25 15:39:13 +10:30
Rusty Russell
64af5db45c lightningd: generalize peer_any_channel to filter on entire channel, not just state.
We're going to use this to ask if there are any channels which make it
important to reconnect to the peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-11-25 15:39:13 +10:30
Rusty Russell
4ee59e7a49 connectd: expose --dev-no-reconnect and --dev-fast-reconnect options.
Once connectd is controlling reconnections, it'll need these.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-11-25 15:39:13 +10:30
Rusty Russell
23dc10cf81 connectd: get our own addresses to contact node from node_announcements.
Let lightningd feed us hints to try first, but we can extract the
addresses from node_announcement messages ourselves.

(Lightningd used to ask gossipd on our behalf: this is far simpler!)

One side effect of this is that we don't hand back address hints given to us
by lightningd: it would use these again for reconnecting.  This is breaks
test_sendpay_grouping, so we disable it temporarily.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-11-25 15:39:13 +10:30
Rusty Russell
b525207e5c lightningd: track whether we're supposed to be throttling gossip.
We're going to need this for the next commit.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-08-12 16:30:29 +09:30
ShahanaFarooqui
b485a026f7 rpc: Removing description from json_command struct 2024-07-31 14:42:58 +09:30
ShahanaFarooqui
89c182e2be rpc: Removing category and verbose from json_command struct 2024-07-31 14:42:58 +09:30
Rusty Russell
c68204a32a lightningd: store our id as a struct pubkey as well as struct node_id.
We convert it in various places, so do that only once.  Also, the name
"id" is a little curt.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-07-23 09:54:47 +09:30
Rusty Russell
19af516dcb lightningd: tell connectd about all scids.
When we set them (i.e. at lockin), when we fire up channeld (for
aliases, which we create at channel init, but aren't really useful
until we have finished channel opening), and at startup.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-07-10 13:34:00 +02:00
Rusty Russell
f122c0beb4 connectd: include map of scid->peer node id.
This will let us fwd onion messages via scid, even if they're aliases.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-07-10 13:34:00 +02:00
Rusty Russell
b5f921ce0a lightningd: add routine to directly inject an onion message.
Unlike "sendonionmessage" which instructs us to send to a peer, this
process it locally (presumably, it contains the next hop).  This is
useful because it allows us to process an onion message which starts
with us (a legal case for a blinded path supplied by someone else!).
It also opens the door to bolt12 self-pay.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-07-10 13:34:00 +02:00
Rusty Russell
4a78d17748 connectd: do response to gossip queries, don't hand them to gossipd.
This basically means moving the code from gossipd to connectd to handle
these queries.

This will get connectd have finer control over ratelimiting them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-07-10 12:21:19 +09:30
Rusty Russell
401533667d connectd: throttle streaming gossip for peers.
We currently stream gossip as fast as we can, even if they start at
timestamp 0.  Instead, use a simple token bucket filter and only let
them have 1MB per second (500 bytes per second for testing).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Protocol: connectd: we now throttle outgoing gossip at 1MB/second per peer.
2024-07-10 12:21:19 +09:30
Rusty Russell
155311b053 connectd: --dev-handshake-no-reply so we can test pending connections.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-05-14 18:16:26 -05:00
Rusty Russell
a9b7402910 pytest: test dropping transient connections.
Requires a hack to exhaust connectd fds and make us close a transient.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-05-14 18:16:26 -05:00
Rusty Russell
8268df9a4b connectd: implement "transient" connections.
Currently, anything which doesn't have a live channel is considered transient.
We free this first under stress, and also if they're still connecting.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-05-14 18:16:26 -05:00
Rusty Russell
ba922f9160 lightningd/connectd: remove --experimental-websocket-port
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Removed: Config `experimental-websocket-port` (deprecated 23.08, EOL 24.02)
2024-03-25 15:02:35 +10:30
Rusty Russell
e0e879c003 common: remove type_to_string files altogther.
This means including <common/utils.h> where it was indirectly included.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-03-20 13:51:48 +10:30
Rusty Russell
37d22f9141 global: change all type_to_string to fmt_X.
This has the benefit of being shorter, as well as more reliable (you
will get a link error if we can't print it, not a runtime one!).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-03-20 13:51:48 +10:30
Yaacov Akiba Slama
a81c1c51f6 Add no-reconnect-private option to disable automatic reconnect-attempts
to peers connected to our node with private channels only
2024-02-13 17:50:00 +01:00
Rusty Russell
06d59839ec connectd: channel_gossip when we've tried to connect to all peers.
It then waits 10 more seconds (for plugins to call setlease, especially)
before it will update a node_announcement.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-01-31 14:47:33 +10:30
Rusty Russell
db6f0da3b3 connectd: separate routine to inject message without closing connection.
We will want this to send private channel_updates direct to peer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-01-31 14:47:33 +10:30
Rusty Russell
50e7c71dc7 lightningd: mark all internal deprecations by version.
I did some CHANGELOG and git digging to see when these were deprecated, and
some were very old (v0.8.2!).  But since they didn't warn users loudly, I
chose to do so this release only.

I renamed ld's `deprecated_apis` to `deprecated_ok` to make sure I
caught them all.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2024-01-26 10:30:22 +10:30
Erik De Smedt
1ae16bfaa5 Create notification on customssg
Create a notification that is triggered when a `costummsg` is received.

Changelog-Added: Plugins: notification custommsg for receiving an unknown protocol message
2023-12-16 11:36:42 +10:30
Rusty Russell
3b622f06c3 lightningd: use param_check on all commands which do extra checks.
This makes `check` much more thorough, and useful.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: JSON-RPC: `check` now does much more checking on every command (not just basic parameter types).
2023-10-26 12:59:55 +10:30
Rusty Russell
bd80af5295 lightningd: allow sending of even messages.
It's your funeral!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2023-10-24 11:50:57 +10:30
Rusty Russell
ad7dcf381e lightningd: tell connectd about the custom messages.
We re-send whenever a plugin which allows them starts/finishes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2023-10-24 11:50:57 +10:30
Rusty Russell
a32b3b68e2 lightningd: stop all subds when we want to disconnect.
This prepares us for connectd disconnecting more gently: it can
wait for these to all disconnect.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2023-10-23 15:48:50 +10:30
Rusty Russell
5cf536d4b1 lightningd: make channel-query functions all take state.
It has the information we need, now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2023-10-02 11:41:19 +10:30
Rusty Russell
0b4622bcbd lightningd/channel.h: clean up channel states.
We should use capability tests for states (can you add htlcs?) rather than vague
descriptions (are you closing?).

And as much as possible, use switch () statements to force us to think
about all the cases, especially when we add new states!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2023-10-02 11:41:19 +10:30
Rusty Russell
11ec03c6da lightningd: generalize peer_any_active_channel to peer_any_channel.
Take an optional filter function, so callers can say exactly what they want. 

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2023-10-02 11:41:19 +10:30