Commit Graph

17592 Commits

Author SHA1 Message Date
Rusty Russell
68b30b1e5d askrene: have child make struct route_query internally.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
b002824217 askrene: move route_query definition and functions into child/.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
85c9179f77 askrene: expose additional_costs htable so child can access it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
0f575ac85a askrene: remove non child-friendly fields from struct route_query.
Notably no access to the struct command and struct plugin.

Note: we actually *do* mess with askrene->reserves, but the previous code
used cmd to get to it.  Now we need to include a non-const pointer in
struct route_query.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
ac9aa975ad askrene: make children use child_log() instead of rq_log.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
33e2f0a47b askrene: move fork() entry point into its own file.
Now there's only one file clearly shared by both parent and child.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
395261fc30 askrene: move fmt_flow_full from askrene.c into flow.c.
Weird that it was in askrene.c

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
8775b62871 askrene: move routines only accessed by the child process into child/.
We want to make it clear when future generations edit the code, which
routines are called in the child (i.e. all the routing), and which in
the parent.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
da2f77767c askrene: add child_log function so child can do logging.
We just shim rq_log for now, but we'll be weaning the child process off
that soon.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
0ede29b81f askrene: fork before calling the route solver.
This is fairly simple.  We do all the prep work, fire off the child,
and it continues all the way to producing JSON output (or an error).
The parent then forwards it.

Limitations (fixed in successive patches):

1. Child logging currently gets lost.
2. We wait for the child, so this code is not a speedup.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
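The fork-then-forward flow described in this commit can be illustrated roughly as follows. This is a minimal Python sketch, not the askrene C code: `solve_in_child`, `solver` and `query` are hypothetical names, and the pipe-plus-JSON transport is an assumption made for illustration. As in the commit's second limitation, the parent blocks until the child finishes, so nothing here is a speedup.

```python
import json
import os


def solve_in_child(solver, query):
    """Run solver(query) in a forked child and return its JSON reply as a dict."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the expensive work, then write JSON (result or error) to the pipe.
        try:
            os.close(read_fd)
            try:
                reply = {"result": solver(query)}
            except Exception as exc:
                reply = {"error": str(exc)}
            with os.fdopen(write_fd, "w") as out:
                out.write(json.dumps(reply))
        finally:
            # Never fall back into the parent's code path.
            os._exit(0)

    # Parent: read until EOF, reap the child, then forward the reply unchanged.
    os.close(write_fd)
    with os.fdopen(read_fd) as inp:
        raw = inp.read()
    os.waitpid(pid, 0)
    return json.loads(raw)
```

For example, `solve_in_child(lambda q: q["amount"] * 2, {"amount": 21})` returns the child's reply as a dict with a `result` member, and a solver that raises yields a dict with an `error` member instead.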
Rusty Russell
e397b12282 askrene: make minflow() static, and remove unused linear_flow_cost.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
d9774e73dc bitcoin: hash_scid and hash_scidd public functions.
We reimplemented this redundantly: hash_scid was called
short_channel_id_hash, so I obviously missed it.

Rename, and implement hash_scidd helper too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
Rusty Russell
9bcac63414 libplugin: add command_finish_rawstr() for when we're simply repeating an entire response.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-19 17:04:35 +10:30
dovgopoly
edbad6cdca pytest: add tests for bcli getblockfrompeer retry path
Add `test_bcli_concurrent` to verify bcli handles concurrent requests while the `getblockfrompeer` retry path is active, simulating a pruned node scenario where `getblock` initially fails.

Add `test_bcli_retry_timeout` to verify lightningd crashes with a clear error message when we run out of `getblock` retries.
2026-02-18 14:16:29 +10:00
dovgopoly
2b39fc0cb4 bcli: replace magic numbers with constants
2026-02-18 14:16:29 +10:00
dovgopoly
4126f3b1fe bcli: refactor wait_and_check_bitcoind and run_bitcoin_cli to use shared execution
Extract `execute_bitcoin_cli` as shared function used by both `run_bitcoin_cli` and `wait_and_check_bitcoind`.
2026-02-18 14:16:29 +10:00
dovgopoly
d727946b14 bcli: return "not found" on any getblockhash exit status
Return "not found" on any `getblockhash` exit status. Previously, only exit code 8 (block height doesn't exist) returned "not found", while other exit codes returned an error. Now any non-zero exit status returns "not found" since any failure means the block is unavailable.
2026-02-18 14:16:29 +10:00
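The behaviour change above amounts to collapsing every failing exit status into the same "not found" answer. A minimal sketch, assuming a hypothetical `getblockhash_reply` helper and a hypothetical reply shape in which a missing block is reported as `None` (the real plugin is C and its reply format is not shown here):

```python
def getblockhash_reply(exit_status, stdout):
    """Map a bitcoin-cli getblockhash outcome to a reply dict (hypothetical shape)."""
    if exit_status != 0:
        # Any failure means the block is unavailable; exit code 8 is no longer special.
        return {"blockhash": None}
    return {"blockhash": stdout.strip()}
```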
dovgopoly
57d60c025b bcli: remove unused async code after sync refactor
Remove the asynchronous execution infrastructure no longer needed after converting all bcli commands to synchronous execution. This includes removing the async callbacks, the pending request queue, etc.

Fix missing `close(from)` file descriptor leak in `run_bitcoin_cliv`.

Changelog-Changed: bcli plugin now uses synchronous execution, simplifying bitcoin backend communication and improving error handling reliability.
2026-02-18 14:16:29 +10:00
dovgopoly
3e979d1b20 pytest: fix bcli tests after sync refactor
Rewrite `test_bitcoin_failure` to reflect synchronous bcli behavior: the node now crashes on invalid bitcoind responses rather than retrying. Add `may_fail` and `broken_log` to handle expected crash.

Update `test_bitcoind_fail_first` stderr check to match the new error message format from `get_bitcoin_result`.

Update test mocks to use proper error format for "block not found".

Co-authored-by: ShahanaFarooqui <shahana.farooqui@gmail.com>
2026-02-18 14:16:29 +10:00
dovgopoly
7b1793f40d lightningd: add get_bitcoin_result for bcli response handling
Add `get_bitcoin_result` function that checks bcli plugin responses for errors and returns the result token. Previously, callbacks only detected errors when result parsing failed, ignoring the explicit error field from the plugin. Now we extract the actual error message from bcli, providing clearer reasoning when the plugin returns an error response.
2026-02-18 14:16:29 +10:00
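The idea of checking the explicit error field before touching the result can be sketched like this. It is an illustration only: `get_result` is a hypothetical Python stand-in, not the C `get_bitcoin_result`, and the reply is assumed to be a JSON-RPC style dict with either a `result` or an `error` member.

```python
def get_result(response):
    """Return the 'result' member of a JSON-RPC style reply, or raise on 'error'."""
    error = response.get("error")
    if error is not None:
        # Surface the plugin's own message rather than a later, vaguer parse failure.
        if isinstance(error, dict):
            message = error.get("message", "unknown error")
        else:
            message = str(error)
        raise RuntimeError(f"plugin returned error: {message}")
    if "result" not in response:
        raise RuntimeError("reply has neither result nor error")
    return response["result"]
```

Callers that previously only noticed a problem when parsing the result failed would now see the plugin's message directly in the raised error.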
dovgopoly
b5c300a82b bcli: convert getrawblockbyheight to synchronous execution
Also rename command_err_badjson to generic command_err helper, since error messages aren't always about bad JSON (e.g., "command failed" for non-zero exit).
2026-02-18 14:16:29 +10:00
dovgopoly
d06024cef7 bcli: convert estimatefees to synchronous execution
Add `command_err_badjson` helper for sync error handling, mirroring the async `command_err_bcli_badjson`. Store args string in `bcli_result` for consistent error messages.
2026-02-18 14:16:29 +10:00
dovgopoly
0de1350706 bcli: convert sendrawtransaction to synchronous execution
2026-02-18 14:16:29 +10:00
dovgopoly
a3e07f4f3a bcli: convert getutxout to synchronous execution
2026-02-18 14:16:29 +10:00
dovgopoly
f8c7a20403 bcli: convert getchaininfo to synchronous execution
2026-02-18 14:16:29 +10:00
dovgopoly
fad05200eb bcli: add synchronous run_bitcoin_cli for future refactor
2026-02-18 14:16:29 +10:00
Rusty Russell
963b353a30 connectd: use membuf for more efficient output queue.
This is exactly what membuf is for: it handles expansion much more
neatly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
afdc92fedf connectd: only do lazy transmission for *definitely* non-urgent messages.
Since we delay the others quite a lot (up to 1 second), it's better to consider
most messages "urgent" and worth immediately transmitting.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
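The flush policy described here (send at once if anything urgent is queued, otherwise let non-urgent messages wait up to about a second) might look like the sketch below. The set of non-urgent types is an assumption chosen for illustration (the three BOLT #7 gossip messages), and `should_flush` is a hypothetical helper, not connectd's actual logic.

```python
# Assumption: only gossip messages are treated as "definitely non-urgent".
# 256 = channel_announcement, 257 = node_announcement, 258 = channel_update.
NON_URGENT_TYPES = {256, 257, 258}


def is_urgent(msg_type):
    """Everything not explicitly known to be non-urgent is considered urgent."""
    return msg_type not in NON_URGENT_TYPES


def should_flush(queued_types, waited_seconds, max_delay=1.0):
    """Decide whether the output queue should be transmitted now."""
    if any(is_urgent(t) for t in queued_types):
        return True
    # Only non-urgent traffic is queued: hold it back until the delay expires.
    return bool(queued_types) and waited_seconds >= max_delay
```

Defaulting unknown types to urgent matches the commit's reasoning: a wrongly delayed message costs up to a second, while a wrongly urgent one only costs an earlier write.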
Rusty Russell
2436ee6f6f connectd: don't flush messages unless we have something important.
This replaces our previous nagle-based toggling.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
8b90d40a75 connectd: pad messages with dummy pings if needed to make size uniform.
Messages are now a constant size.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: Protocol: we now pad all peer messages to make them the same length.
2026-02-18 14:13:25 +10:30
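As a rough illustration of padding with dummy pings, the sketch below fills the gap between the queued plaintext and a target size with a single BOLT #1 `ping`. The target size, the helper names, and the choice to ignore the per-message encryption framing are all assumptions; connectd's real accounting is not shown in this commit message.

```python
import struct

PING_TYPE = 18        # BOLT #1 ping message type
PING_HEADER_LEN = 6   # type (2) + num_pong_bytes (2) + byteslen (2)


def padding_ping(total_len):
    """Build a ping whose plaintext is exactly total_len bytes long."""
    if total_len < PING_HEADER_LEN:
        raise ValueError("too small to hold a ping")
    ignored = total_len - PING_HEADER_LEN
    # A num_pong_bytes value of 65532 or more asks the peer not to send a pong.
    return struct.pack(">HHH", PING_TYPE, 65535, ignored) + bytes(ignored)


def pad_to_uniform(messages, target_len):
    """Append one padding ping so the plaintext totals target_len, if it fits."""
    gap = target_len - sum(len(m) for m in messages)
    if gap < PING_HEADER_LEN:
        # Nothing to add (or no room for even an empty ping) in this sketch.
        return list(messages)
    return list(messages) + [padding_ping(gap)]
```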
Rusty Russell
ca2d389920 devtools/gossipwith: don't count "padding" pings towards max-messages count.
We are about to use them to make our packet size constant, and this
will upset the tests.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
d45bc2d56e connectd: don't toggle nagle on and off, leave it always off.
We're doing our own buffering now.

We leave the is_urgent() function for two commits in the future though.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
c23b7a492d connect: switch to using io_write_partial instead of io_write.
This gives us finer control over write sizes: for now we just cap
the write size.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
df1ae1d680 connectd: refactor to break up "encrypt_and_send".
Do all the special treatment of the message type first.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
7577e59f6c connectd: refactor outgoing loop.
Give us a single "next message" function to call.  This will be useful
when we want to write more than one at a time.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
42bdb2d638 CI: run tests in the wireshark group so we can test packet sizes
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
369338347d pytest: add fixture for checking packet sizes.
This requires access to dumpcap.  On Ubuntu, at least, this means you
need to be in the "wireshark" group.

We may also need:
	sudo ethtool -K lo gro off gso off tso off

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
cd7afb506a pytest: remove now-invalid test.
Commit 888745be16 ("dev_disconnect: remove @ marker.", in v0.11 in
April 2022) removed the '@' marker from
our dev_disconnect code, but one test still uses it.

Refactoring this code made it crash on invalid input.  The test
triggered a db issue which has been long fixed, so I'm simply removing
it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
4d030d83ce pytest: fix flake in test_fetchinvoice_autoconnect.
l3 doesn't just need to know about l2 (which it can get from the
channel_announcement), but needs to see the node_announcement.

Otherwise:

```
        l1, l2 = node_factory.line_graph(2, wait_for_announce=True,
                                         # No onion_message support in l1
                                         opts=[{'dev-force-features': -39},
                                               {'dev-allow-localhost': None}])
    
        l3 = node_factory.get_node()
        l3.rpc.connect(l1.info['id'], 'localhost', l1.port)
        wait_for(lambda: l3.rpc.listnodes(l2.info['id'])['nodes'] != [])
    
        offer = l2.rpc.call('offer', {'amount': '2msat',
                                      'description': 'simple test'})
>       l3.rpc.call('fetchinvoice', {'offer': offer['bolt12']})

tests/test_pay.py:4804: 
...	
>           raise RpcError(method, payload, resp['error'])
E           pyln.client.lightning.RpcError: RPC call failed: method: fetchinvoice, payload: {'offer': 'lno1qgsqvgnwgcg35z6ee2h3yczraddm72xrfua9uve2rlrm9deu7xyfzrcgqypq5zmnd9khqmr9yp6x2um5zcssxwz9sqkjtd8qwnx06lxckvu6g8w8t0ue0zsrfqqygj636s4sw7v6'}, error: {'code': 1003, 'message': 'Failed: could not route or connect directly to 033845802d25b4e074ccfd7cd8b339a41dc75bf9978a034800444b51d42b07799a: {"code":400,"message":"Unable to connect, no address known for peer"}'}
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-18 14:13:25 +10:30
Rusty Russell
29e0a1ddfe bkpr: limp along if we lost our db.
We can't really do decent bookkeeping any more, but don't crash!

```
bookkeeper: plugins/bkpr/recorder.c:178: find_txo_chain: Assertion `acct->open_event_db_id' failed.
bookkeeper: FATAL SIGNAL 6 (version v25.12)
0xaaaab7d51a7f send_backtrace
	common/daemon.c:38
0xaaaab7d51b2b crashdump
	common/daemon.c:83
0xffff8c0b07cf ???
	???:0
0xffff8bdf7608 __pthread_kill_implementation
	./nptl/pthread_kill.c:44
0xffff8bdacb3b __GI_raise
	../sysdeps/posix/raise.c:26
0xffff8bd97dff __GI_abort
	./stdlib/abort.c:79
0xffff8bda5cbf __assert_fail_base
	./assert/assert.c:96
0xffff8bda5d2f __assert_fail
	./assert/assert.c:105
0xaaaab7d41fd7 find_txo_chain
	plugins/bkpr/recorder.c:178
0xaaaab7d421fb account_onchain_closeheight
	plugins/bkpr/recorder.c:291
0xaaaab7d37687 do_account_close_checks
	plugins/bkpr/bookkeeper.c:884
0xaaaab7d38203 parse_and_log_chain_move
	plugins/bkpr/bookkeeper.c:1261
0xaaaab7d3871f listchainmoves_done
	plugins/bkpr/bookkeeper.c:171
0xaaaab7d4811f handle_rpc_reply
	plugins/libplugin.c:1073
0xaaaab7d4827b rpc_conn_read_response
	plugins/libplugin.c:1377
0xaaaab7d889a7 next_plan
	ccan/ccan/io/io.c:60
0xaaaab7d88f7b do_plan
	ccan/ccan/io/io.c:422
0xaaaab7d89053 io_ready
	ccan/ccan/io/io.c:439
```

Fixes: https://github.com/ElementsProject/lightning/issues/8854
Changelog-Fixed: Plugins: `bkpr_listbalances` no longer crashes if we lost our db, then do emergencyrecover and close a channel.
Reported-by: https://github.com/enaples
2026-02-17 12:10:26 +10:30
Rusty Russell
2e8261ef9e pytest: test for bkpr_listbalances after emergencyrecover.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-17 12:10:26 +10:30
Rusty Russell
b150309854 pytest: test for crash when we have dying channels and compact the gossip_store.
Before I fixed the handling of dying channels:

```
lightning_gossipd: gossip_store: can't read hdr offset 2362/2110: Success (version v25.12-279-gb38abe6-modded)
0x6537c19ecf3a send_backtrace
        common/daemon.c:38
0x6537c19f1a1d status_failed
        common/status.c:207
0x6537c19e557a gossip_store_get_with_hdr
        gossipd/gossip_store.c:527
0x6537c19e5613 check_msg_type
        gossipd/gossip_store.c:559
0x6537c19e5a36 gossip_store_set_flag
        gossipd/gossip_store.c:577
0x6537c19e5c82 gossip_store_del
        gossipd/gossip_store.c:629
0x6537c19e8ddd gossmap_manage_new_block
        gossipd/gossmap_manage.c:1362
0x6537c19e390e new_blockheight
        gossipd/gossipd.c:430
0x6537c19e3c37 recv_req
        gossipd/gossipd.c:532
0x6537c19ed22a handle_read
        common/daemon_conn.c:35
0x6537c19fbe71 next_plan
        ccan/ccan/io/io.c:60
0x6537c19fc174 do_plan
        ccan/ccan/io/io.c:422
0x6537c19fc231 io_ready
        ccan/ccan/io/io.c:439
0x6537c19fd647 io_loop
        ccan/ccan/io/poll.c:470
0x6537c19e463d main
        gossipd/gossipd.c:609
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-16 17:23:33 +10:30
Rusty Russell
acb8a8cc15 gossipd: dev-compact-gossip-store to manually invoke compaction.
And tests!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-16 17:23:33 +10:30
Rusty Russell
88f3f97b7c gossipd: reset dying_channels array after compact.
Reported-by: @daywalker90
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-16 17:23:33 +10:30
Rusty Russell
912b40aeff gossipd: compact when gossip store is 80% deleted records.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: `gossipd` now uses a `lightning_gossip_compactd` helper to compact the gossip_store on demand, keeping it under about 210MB.
2026-02-16 17:23:33 +10:30
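The trigger condition can be stated as a one-line check. This sketch assumes the ratio is measured over record counts and uses a hypothetical `needs_compaction` helper; whether gossipd counts records or bytes is not stated in the commit message.

```python
def needs_compaction(deleted_records, total_records, threshold_percent=80):
    """Return True once deleted records make up at least threshold_percent of the store."""
    if total_records == 0:
        return False
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return deleted_records * 100 >= total_records * threshold_percent
```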
Rusty Russell
15696d97bd gossipd: code to invoke compactd and reopen store.
This isn't called anywhere yet.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-16 17:23:33 +10:30
Rusty Russell
f56f8adcdf gossipd: lightningd/lightning_gossip_compactd
A new subprocess run by gossipd to create a compacted gossip store.

It's pretty simple: a linear compaction of the file.  Once it has compacted the amount it
was told to, gossipd waits until it completes the last bit.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-16 17:23:33 +10:30
Rusty Russell
a966dd71ad common: expose gossip_store "header and type" single-read struct.
gossip_store.c uses this to avoid two reads, and we want to use it
elsewhere too.

Also fix old comment on gossip_store_readhdr().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-16 17:23:33 +10:30
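Reading the record header and the message type in one go can be sketched as below. The field layout (big-endian 16-bit flags, 16-bit length, 32-bit CRC, 32-bit timestamp, followed by the 16-bit type that starts the message itself) is an assumption about the gossip_store record format, and `read_hdr_and_type` is a hypothetical Python stand-in for the C struct.

```python
import io
import struct

# Assumed layout: be16 flags, be16 len, be32 crc, be32 timestamp, be16 message type.
HDR_AND_TYPE = struct.Struct(">HHIIH")


def read_hdr_and_type(f):
    """Read header and message type with a single read; return None at EOF."""
    buf = f.read(HDR_AND_TYPE.size)
    if len(buf) < HDR_AND_TYPE.size:
        return None
    flags, length, crc, timestamp, msg_type = HDR_AND_TYPE.unpack(buf)
    # The type is the first two bytes of the message, so it is counted in 'length'.
    return {"flags": flags, "len": length, "crc": crc,
            "timestamp": timestamp, "type": msg_type}


# Usage: a fake record for a channel_update (type 258) with a 2-byte body.
record = HDR_AND_TYPE.pack(0, 2, 0xDEADBEEF, 1700000000, 258)
parsed = read_hdr_and_type(io.BytesIO(record))
```

A caller that only needs to filter by type can then skip the rest of the record without a second read.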
Rusty Russell
1fb4da075f gossipd: put the last_writes array inside struct gossip_store.
This is the file responsible for all the writing, so it should be
responsible for the rewriting if necessary (rather than
gossmap_manage).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-16 17:23:33 +10:30
Rusty Russell
facf24b6ee devtools/gossmap-compress: create latest gossip_store version
This saves gossipd from converting it:

```
lightningd-1 2026-02-02T00:50:49.505Z DEBUG   gossipd: Time to convert version 14 store: 890 msec
```

This reduces node startup time from 1.4 seconds to 0.5 seconds.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2026-02-16 17:23:33 +10:30