Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Protocol: we now re-transmit unseen funding transactions on startup, for more robustness.
This happens if the channel is *not* announcable yet. Then we hit the assertion
in funding_depth_cb that the txid is the same as the current funding.txid.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-EXPERIMENTAL: fixed crash when we splice a channel which hasn't been announced yet.
One issue we have in CI is reconnection races: if an incoming
connection arrives while an outgoing one is negotiated, we close the
outgoing one and issue a disconnect, which fails any connect attempts.
By sending a "reconnected" message instead of disconnect/connect we
can avoid disturbing in-progress connection attempts which happens in CI
quite a bit.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They can all call get_block_height(); the extra argument confused me and
I thought they were called before the block height was actually updated.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The value of `our_funds` inlightningd is the funds added to the channel during creation. Splicing is a quasi-creation event. This change makes our pending funds be considered funding funds at the moment of splice confirmation.
Changelog-None
We can fail the fundchannel because of a reconnect race: funder tells
us to reconnect so fast that we are still cleaning up from last time.
We deliberately defer clean up, to give the subds a chance to process any
final messages. However, on reconnection we force them to exit immediately:
but this causes new `connect` commands to see the exit and fail.
The workaround is to do this cleanup when a `connect` command is
issued (as well as the current case, which covers an automatic
reconnection or an incoming reconnection)
```
2025-05-09T00:40:37.1769508Z def test_reconnect_signed(node_factory):
2025-05-09T00:40:37.1770273Z # This will fail *after* both sides consider channel opening.
2025-05-09T00:40:37.1770850Z disconnects = ['<WIRE_FUNDING_SIGNED']
2025-05-09T00:40:37.1771298Z if EXPERIMENTAL_DUAL_FUND:
2025-05-09T00:40:37.1771735Z disconnects = ['<WIRE_COMMITMENT_SIGNED']
2025-05-09T00:40:37.1772155Z
2025-05-09T00:40:37.1772598Z l1 = node_factory.get_node(may_reconnect=True, disconnect=disconnects)
2025-05-09T00:40:37.1773210Z l2 = node_factory.get_node(may_reconnect=True)
2025-05-09T00:40:37.1773632Z
2025-05-09T00:40:37.1773917Z l1.fundwallet(2000000)
2025-05-09T00:40:37.1774268Z
2025-05-09T00:40:37.1774611Z l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
2025-05-09T00:40:37.1775151Z > l1.rpc.fundchannel(l2.info['id'], CHANNEL_SIZE)
2025-05-09T00:40:37.1775388Z
2025-05-09T00:40:37.1775485Z tests/test_connection.py:667:
...
2025-05-09T00:40:37.1799527Z > raise RpcError(method, payload, resp['error'])
2025-05-09T00:40:37.1800993Z E pyln.client.lightning.RpcError: RPC call failed: method: fundchannel, payload: {'id': '022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59', 'amount': 50000, 'announce': True}, error: {'code': -1, 'message': 'Disconnected', 'data': {'id': '022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59', 'method': 'openchannel_update'}}
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is a bit too much boilerplate for these, which mainly do the same
thing.
We add annotaitons to new_peer_fd so the compiler knows that it cannot
return NULL, otherwise with -O3 we get:
```
lightningd/peer_control.c: In function ‘peer_connected_hook_final’:
lightningd/peer_control.c:1388:28: error: ‘error’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
1388 | take(towire_connectd_peer_send_msg(NULL, &channel->peer->id,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
lightningd/peer_control.c:1313:19: note: ‘error’ was declared here
1313 | const u8 *error;
| ^~~~~
lightningd/peer_control.c: In function ‘peer_spoke’:
lightningd/peer_control.c:1999:28: error: ‘error’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
1999 | take(towire_connectd_peer_send_msg(NULL, &peer->id,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make: *** [Makefile:311: lightningd/peer_control.o] Error 1
```
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The updated API requires typed htables to explicitly state whether they
allow duplicates: for most cases we don't, but we've had issues in the
past.
This is a big patch, but mainly mechanical.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cleans up the API: we have two functions now, one which is explicitly for
"I'm failing this because I saw this tx onchain".
Now we can correctly report the tx which closed the channel (previously
we would always report our own tx(s)!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: JSON-RPC: `close` now correctly reports the txid of the remote onchain unilateral tx if it races with a peer close.
Changelog-Fixed: Protocol: we no longer try to spend anchors if a commitment tx is already mined (reported by @niftynei).
Fixes: #7526
If we connected out, remember that address. We always remember the last
address, but that may be an incoming address. This is explicitly the last
outgoing address which worked.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1024 is a common limit, and people are starting to hit that many channels, so we should increase it: twice the number of channels seems reasonable, though we only do this at restart time.
Changelog-Changed: lightningd: we now try to increase the number of file descriptors, if it's less than twice the number of channels at startup (and log if we cannot!).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We use the *same* callback for the funding tx, as well as for inflight dual-funding txs, as well as inflight splice txs. This is deeply confusing!
Instead, use explicit cbs for splicing and df. Once they're locked in, use the normal callback.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Update the lightningd <-> channeld interface with lots of new commands to needed to facilitate spicing.
Implement the channeld splicing protocol leveraging the interactivetx protocol.
Implement lightningd’s channel_control to support channeld in its splicing efforts.
Changelog-Added: Added the features to enable splicing & resizing of active channels.
Clean restart of daemon after a tx-abort is a nice way to work around
the 'persistent' disconnect that we t-bast noticed.
Changelog-Fixed: `dualopend`: Fix behavior for tx-aborts. No longer hangs, appropriately continues re-init of RBF requests without reconnction msg exchange.
After connecting 100,000 peers with one channel each (not all at
once!), we see various places where we exhibit O(N^2) behaviour.
Fix these by keeping a map of id->peer instead of a simple
linked-list.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We used to tell connectd to remember our connect delay, and hand it
back (increased if necessary).
Instead, simply record when we last tried to connect. If it was less
than 10 minutes ago, double delay (up to 5 minutes max), otherwise
reset delay to 1 second.
This covers all scenarios: whether we reconnect then immediately
disconnect, or never successfully connect, it doesn't matter.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: #5453
Give them time to process any final messages! If there's a reconnect,
then we need to clean them up immediately of course.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This allows us to detect when lightningd hasn't seen our latest
disconnect/reconnect; in particular, we would hit the following pattern:
1. lightningd says to connect a subd.
2. connectd disconnects and reconnects.
3. connectd reads message, connects subd.
4. lightningd reads disconnect and reconnect, sends msg to connect to subd again.
5. connectd asserts because subd is alreacy connected.
This way connectd can tell if lightningd is talking about the previous
connection, and ignoere it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Before this patch:
1. connectd says it's connected (peer_connected)
2. we tell connectd we want to talk about each channel (peer_make_active)
3. connectd gives us an fd for each channel, and we connect it to a subd (peer_active)
4. OR, connectd says it sent something about a channel we didn't tell it about, with an fd (peer_active)
Now:
1. connectd says it's connected (peer_connected)
2. we start all appropriate subds and tell connectd to what channels/fds (peer_connect_subd).
3. if connectd says it sent something about a channel we didn't tell it about, we either tell
it to hang up (peer_final_msg), or connect a new opening daemon (peer_connect_subd).
This is the minimal-size patch, which is why we create socket pairs in
so many places to use the existing functions. Many cleanups are
possible, since the new flow is so simple.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
First, connectd tells us the peer has connected, and we call the connected hook,
and if it says it's fine, we are actually connected and we fire off notifications.
Of course, we could be disconnected while in the connected hook, and that would
mean we tell people about a connection which is no longer current.
Make this clear with a tristate: if we're not marked disconnected by
the time the hooks finish, we're good. It also gives us a cleaner
"connect" command return when we connected but disconnected before
processing.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is redundant now, since connectd only sends us this once we tell
it it's OK, but that's changing, so clean up now. This means that
connectd will be able to make *unsolicited* closes, if it needs to.
We share logic with peer_please_disconnect.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We have them split over common/param.c, common/json.c,
common/json_helpers.c, common/json_tok.c and common/json_stream.c.
Change that to:
* common/json_parse (all the json_to_xxx routines)
* common/json_parse_simple (simplest the json parsing routines, for cli too)
* common/json_stream (all the json_add_xxx routines)
* common/json_param (all the param and param_xxx routines)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Activate means a specific thing now (connectd said something), so avoid
confusing it with this function.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Either because lightningd tells us it wants to talk, or because the peer
says something about a channel.
We also introduce a behavior change: we disconnect after a failed open.
We might want to modify this later, but we it's a side-effect of openingd
not holding onto idle connections.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently intuit this by whether there's a subdaemon owning it.
But we're about to change the rules and allow connectd to hold idle
connections, so we need an explicit flag.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of doing this weird chaining, just call them all at once and
use a reference counter.
To make it simpler, we return the subd_req so we can hang a destructor
off it which decrements after the request is complete.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is neater than what we had before, and slightly more general.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Changed: JSON_RPC: `sendcustommsg` now works with any connected peer, even when shutting down a channel.
We now let gossipd do it.
This also means there's nothing left in 'struct per_peer_state' to
send across the wire (the fds are sent separately), so that gets
removed from wire messages too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Enable non-dev builds to send custom messages.
Preserves 'dev-' for compat-enabled builds.
Changelog-Changed: JSON-RPC: moved dev-sendcustommsg to sendcustommsg