store exception in variable instead of using a bool flag
add default str to routing exceptions
Add separate exception class to handle fee related payment errors
add space
add gossip address field serialization, parsing and tests
fix linter
consolidate tests, fix indentation
refactor tests into loops
Ideally, given an on-chain backup, after the remote force-closes, we should be able to spend our anchor output,
to CPFP the remote commitment tx (assuming the channel used OPTION_ANCHORS).
To spend the anchor output, we need to be able to sign with the local funding_privkey.
Previously we derived the funding_key from the channel_seed (which comes from os.urandom).
Prior to anchors, there was no use case for signing with the funding_key given a channel backup.
Now with anchors, we should make its derivation deterministic somehow, in a way so that it can
be derived given just an on-chain backup.
- one way would be to put some more data into the existing OP_RETURN
- uses block space
- the OP_RETURNs can be disabled via "use_recoverable_channels"
- only the initiator can use OP_RETURNs (so what if channel is in incoming dir?)
- instead, new scheme for our funding_key:
- we derive the funding_privkey from the lnworker root secret (derived from our bip32 seed)
- for outgoing channels:
- lnworker_root_secret + remote_node_id + funding_tx_nlocktime
- for incoming channels:
- lnworker_root_secret + remote_node_id + remote_funding_pubkey
- a check is added to avoid reusing the same key between channels:
by not letting the user open more than one channel with the same peer in a single block
- only the first 16 bytes of the remote_node_id are used, as the onchain backup OP_RETURNs only contain that
- as the funding_privkey cannot be derived from the channel_seed anymore, it is included in the
imported channel backups, which in turn need a new version defined
- a wallet db upgrade is used to update already stored imported cbs
- alternatively we could keep the imported cbs as-is, so no new version, no new funding_privkey field, as it is clearly somewhat redundant given on-chain backups can reconstruct it
- however adding the field seems easier
- otherwise the existing code would try to derive the funding_privkey from the channel_seed
- also note: atm there is no field in the imported backups to distinguish anchor channels vs static-remotekey channels
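The derivation scheme above can be sketched roughly as follows; `derive_funding_privkey`, the plain sha256 concatenation, and the field ordering are illustrative assumptions, not the actual implementation:

```python
import hashlib
from typing import Optional

def derive_funding_privkey(
    lnworker_root_secret: bytes,            # derived from our bip32 seed
    remote_node_id: bytes,                  # 33-byte remote node pubkey
    funding_tx_nlocktime: Optional[int] = None,     # used for outgoing channels
    remote_funding_pubkey: Optional[bytes] = None,  # used for incoming channels
) -> bytes:
    # only the first 16 bytes of the node id are used, as the on-chain
    # backup OP_RETURNs only contain that prefix
    node_id_prefix = remote_node_id[:16]
    if funding_tx_nlocktime is not None:      # outgoing channel
        salt = funding_tx_nlocktime.to_bytes(4, "big")
    elif remote_funding_pubkey is not None:   # incoming channel
        salt = remote_funding_pubkey
    else:
        raise ValueError("need nlocktime (outgoing) or remote funding pubkey (incoming)")
    return hashlib.sha256(lnworker_root_secret + node_id_prefix + salt).digest()
```

Note how the per-channel salt motivates the key-reuse check above: two outgoing channels to the same peer funded in the same block would share an nlocktime, hence the one-channel-per-peer-per-block restriction.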
* sets the weight of htlc transactions to zero, thereby assigning a zero
fee to the htlc transactions
* add inputs to htlc-tx for fee bumping
* switches feature flags
* disable anchor test vectors, which are now partially invalid
* in order to be able to sweep to_remote in an onchain backup scenario
we need to retain the private key for the payment_basepoint
* to facilitate the above, we open channels with a payment_basepoint
derived from a static secret tied to the wallet seed (the
static_payment_key) combined with the funding pubkey (multisig_key),
which we can restore from the channel closing transaction
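A minimal sketch of the recovery property described above, assuming (hypothetically) that the payment_basepoint secret is a hash of the static_payment_key and the funding pubkey; the helper name and the exact combination are illustrative:

```python
import hashlib

def derive_payment_basepoint_secret(static_payment_key: bytes, funding_pubkey: bytes) -> bytes:
    # static_payment_key is tied to the wallet seed, so it survives loss of
    # channel data; the funding pubkey appears in the 2-of-2 multisig that
    # the channel closing transaction spends, so both inputs are
    # recoverable from on-chain data alone.
    return hashlib.sha256(static_payment_key + funding_pubkey).digest()
```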
- move some checks into two helper methods:
- invariant checks are performed in check_accepted_htlc
  - mpp checks are performed in check_mpp_is_waiting
- in order to avoid passing local_height to check_accepted_htlc,
the height in the error message is added by create_onion_error.
- do not fail because chain tip is stale if we already forwarded
- if we already forwarded a htlc and its cltv gets too close, do
not return, as this means we would never fulfill it
- Wait until HTLCs are irrevocably removed before cleaning up their
data structures (MPP and forwarding)
- keep methods maybe_cleanup_mpp and maybe_cleanup_forwarding separate
- perform cleanup in htlc_switch, so that process_unfulfilled_htlc
  has fewer side effects
- In htlc_switch, we blank the onion_packet_hex field to signal that
an HTLC has been processed. An item of chan.unfulfilled_htlcs may
go through 4 stages:
- 1. not forwarded yet: (None, onion_packet_hex)
- 2. forwarded: (forwarding_key, onion_packet_hex)
- 3. processed: (forwarding_key, None), not irrevocably removed yet
- 4. done: (forwarding_key, None), irrevocably removed
- in test_lnpeer, an extra iteration of htlc_switch has been added to
trampoline forwarding tests
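The four stages above can be captured by a small helper; `classify_unfulfilled_htlc` is a hypothetical illustration of the state machine, not code from this change:

```python
def classify_unfulfilled_htlc(forwarding_key, onion_packet_hex, *, irrevocably_removed: bool) -> int:
    """Return the stage (1-4) of an item of chan.unfulfilled_htlcs."""
    if onion_packet_hex is not None:
        # onion packet still present: HTLC has not been processed yet
        return 1 if forwarding_key is None else 2  # not forwarded / forwarded
    if forwarding_key is None:
        raise ValueError("inconsistent state: neither forwarding_key nor onion packet")
    # onion packet blanked: processed (3), or done once irrevocably removed (4)
    return 4 if irrevocably_removed else 3
```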
follow-up 6fb9207a44
> technically the fee estimate of a given bitcoind only changes on new blocks, but because of how we are asking for fee estimates and how we are taking the median of many interfaces, it can change at any time for us
note: running this in the peer taskgroup, for proper exception-handling and logging
If we accept a MPP and we forward the payment (trampoline or swap),
we need to persist the payment accepted status, or we might wrongly
release htlcs on the next restart.
lnworker.received_mpp_htlcs used to be cleaned up in maybe_cleanup_forwarding,
which only applies to forwarded payments. However, since we now
persist this dict, we need to clean it up also in the case of
payments received by us. This part of maybe_cleanup_forwarding has
been migrated to lnworker.maybe_cleanup_mpp.
This will be useful if we decide to ship lntransport as a separate
package. It is also a conceptual cleanup.
Notes:
- lntransport still requires crypto.py
- parsing node id from a bolt11 invoice is not supported.
I just had a channel force-closed over a fee-estimate-disagreement :(
Scenario:
1. started electrum, opened wallet
2. waited around 10 seconds, opened the lightning channels overview, saw that channels are open/ready
3. after around 10 more seconds, scanned bolt11 invoice and tried to pay
4. channel got force-closed
Before this commit, we only call maybe_update_fee via lnwatcher callbacks.
These callbacks trigger on events such as "adb_set_up_to_date", "blockchain_updated", "network_updated", "fee".
In my case there was a race that all these events triggered *before* the channel got reestablished
(in fact before the peer handshake finished). Also, by chance, none of these events fired after
the reestablish but before I sent the HTLC.
When I sent the HTLC, the channel counterparty (eclair) sent back an "error" msg that the feerates are
too different, which led us to do a local-force-close.
I have other channels in this wallet (with other peers), which reestablished faster and got lucky with
timing: the lnwatcher callbacks came just in time to trigger update_fee for them.
```
20241017T222847.598163Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | handshake done for 03ecef675be448b615e6176424070673ef8284e0fd19d8be062a6cb5b130a0a0d1@lightning.electrum.org:9740
20241017T222847.602594Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Sending INIT
20241017T222847.641383Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Received INIT
20241017T222847.655041Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Sending CHANNEL_REESTABLISH
20241017T222847.658355Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | channel_reestablish (<redacted_shortchanid>): sent channel_reestablish with (next_local_ctn=157, oldest_unrevoked_remote_ctn=156)
20241017T222847.659524Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | reestablish_channel was called but channel <redacted_shortchanid> already in peer_state <PeerState.REESTABLISHING: 1>
20241017T222847.660491Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | reestablish_channel was called but channel <redacted_shortchanid> already in peer_state <PeerState.REESTABLISHING: 1>
20241017T222847.661442Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | reestablish_channel was called but channel <redacted_shortchanid> already in peer_state <PeerState.REESTABLISHING: 1>
20241017T222847.662768Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | reestablish_channel was called but channel <redacted_shortchanid> already in peer_state <PeerState.REESTABLISHING: 1>
20241017T222847.669875Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Received QUERY_CHANNEL_RANGE
20241017T222847.690318Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Received GOSSIP_TIMESTAMP_FILTER
20241017T222847.705782Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Received CHANNEL_REESTABLISH
20241017T222847.707932Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | channel_reestablish (<redacted_shortchanid>): received channel_reestablish with (their_next_local_ctn=157, their_oldest_unrevoked_remote_ctn=156)
20241017T222847.712504Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | channel_reestablish (<redacted_shortchanid>): replayed 0 unacked messages. []
20241017T222847.716096Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Sending CHANNEL_READY
20241017T222847.738709Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | saved remote channel_update gossip msg for chan <redacted_shortchanid>
20241017T222907.627447Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | lnpeer.pay len(route)=1
20241017T222907.628927Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | 0: edge=<redacted_shortchanid> hop_data=<OnionHopsDataSingle. payload={<redacted>}. hmac=None>
20241017T222907.629184Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | adding trampoline onion to final payload
20241017T222907.629405Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | lnpeer.pay len(t_route)=2
20241017T222907.629653Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | 0: t_node=03ecef675be448b615e6176424070673ef8284e0fd19d8be062a6cb5b130a0a0d1 hop_data=<OnionHopsDataSingle. payload={<redacted>}. hmac=<redacted>>
20241017T222907.629894Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | 1: t_node=<redacted> hop_data=<OnionHopsDataSingle. payload={<redacted>}. hmac=b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'>
20241017T222907.631495Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | starting payment. len(route)=1.
20241017T222907.633075Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | starting payment. htlc: UpdateAddHtlc(amount_msat=<redacted>, payment_hash=<redacted>, cltv_abs=<redacted>, timestamp=1729204147, htlc_id=66)
20241017T222907.633385Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Sending UPDATE_ADD_HTLC
20241017T222907.635118Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | send_commitment. chan <redacted_shortchanid>. ctn: 157.
20241017T222907.643229Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Sending COMMITMENT_SIGNED
20241017T222907.721929Z | DEBUG | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Received ERROR
20241017T222907.722621Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | remote peer sent error [DO NOT TRUST THIS MESSAGE]: 'local/remote feerates are too different: remoteFeeratePerKw=<redacted_low_val> localFeeratePerKw=50125'. chan_id=<redacted>. is_known_chan_id=True
20241017T222907.734272Z | DEBUG | lnchannel.Channel.[<redacted_shortchanid>] | Setting channel state: OPEN -> FORCE_CLOSING
20241017T222907.825540Z | INFO | lnpeer.Peer.[LNWallet, 03ecef675b-98960573] | Disconnecting: GracefulDisconnect()
```
The messages are sometimes logged and sometimes shown to the user:
- for logging we might not want to truncate or have higher limits,
- but when shown to the user, we definitely want to truncate the error text.
It is simplest to just do the truncation here, at the lowest level.
Note that we usually prepend the error text with a header e.g. "[DO NOT TRUST THIS MESSAGE]"
and if the error text is too long, this header at the beginning might get "lost" in some way.
Hence we should truncate the error text.
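A sketch of the kind of low-level truncation meant here; the helper name and the length limit are illustrative assumptions:

```python
def truncate_error_text(error_text: str, max_len: int = 500) -> str:
    # Truncate peer-supplied error text so that a header we prepend,
    # e.g. "[DO NOT TRUST THIS MESSAGE]", cannot get drowned out by an
    # arbitrarily long message.
    if len(error_text) <= max_len:
        return error_text
    return error_text[:max_len - 3] + "..."
```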
Instead of some functions operating with hex strings,
and others using bytes, this consolidates most things to use bytes.
This mainly focuses on bitcoin.py and transaction.py,
and then adapts the API usages in other files.
Notably,
- scripts,
- pubkeys,
- signatures
should be bytes in almost all places now.
This gives more time for the client to come back online.
see https://github.com/spesmilo/electrum/issues/8940
- re note on submarine_swaps.py#L53:
lnpeer.Peer.maybe_fulfill_htlc only checks against MIN_FINAL_CLTV_DELTA_ACCEPTED(=144),
so this increased cltv_delta is not enforced when receiving the htlc on ln.
It is put in the invoice, so the sender is supposed to honour it, of course.
It would be nice to enforce it (make the check in maybe_fulfill_htlc dependent on
what was in the invoice).
related https://github.com/spesmilo/electrum/pull/8924
note: "forwarding_key" can be stored as False. I think this is ugly, and
it might be better to do a storage upgrade to change those values to None.
In maybe_fulfill_htlc, return two items: (preimage, (payment_key, callback)).
Rationale: The caller needs a payment_key only if we return a callback.
If we do not, the caller should use the payment_key that was previously
stored in channel.unfulfilled_htlcs.
Note that for trampoline onions, payment_key may contain the inner
or outer secret.
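Caller-side, the contract described above might look like this sketch; the helper name and variable names are illustrative:

```python
def resolve_payment_key(result, stored_payment_key):
    # `result` mimics maybe_fulfill_htlc's return value:
    #   (preimage, (payment_key, callback))
    # The returned payment_key only matters when a callback is present;
    # otherwise fall back to the payment_key previously stored in
    # channel.unfulfilled_htlcs.
    preimage, (payment_key, callback) = result
    if callback is not None:
        return preimage, payment_key, callback
    return preimage, stored_payment_key, None
```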
We must not process incoming updates for a given channel until we ~finished reestablishing it.
Consider both parties have some unacked updates they want to replay during reestablish.
If Bob reacts to Alice's replayed stuff before he himself replays his stuff, madness ensues.
I think this should fix the remaining part of https://github.com/spesmilo/electrum/pull/8778
(timing issues when running the unit tests with py3.12)
This allows making any message handler async in lnpeer.
Note: `process_message` is only called from `_message_loop`.
There are (or would be) basically three types of message handlers:
1. "traditional blocking msg handlers". non-async ones. When these handlers are called, `process_message` naturally blocks until the handler returns, which means `_message_loop` also blocks until the message is fully processed before starting the next iteration.
2. "async blocking msg handlers". async ones where we want the previous property, i.e. we want the `_message_loop` to wait until the handler finishes. We await the handler inside `process_message`, and `_message_loop` awaits `process_message`.
3. "async non-blocking msg handlers". async message handlers that can be spawned e.g. onto `Peer.taskgroup` and the loop is free to start processing subsequent messages. e.g. msg handlers that start a negotiation, such as `on_shutdown` and `on_open_channel`.
Any non-async message handler (`def on_...`) automatically goes into category 1.
An async message handler, by default, goes into category 2, "blocking";
to go into category 3 ("non-blocking"), we use the `runs_in_taskgroup` function decorator.
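A minimal sketch of such a decorator, assuming the instance exposes a `taskgroup` with a `create_task` method; the real implementation may differ:

```python
import functools

def runs_in_taskgroup(func):
    # Marks an async message handler as "non-blocking" (category 3):
    # calling it merely spawns the handler onto the peer's taskgroup and
    # returns, so _message_loop is free to process the next message.
    @functools.wraps(func)
    async def wrapper(self, *args, **kwargs):
        self.taskgroup.create_task(func(self, *args, **kwargs))
    return wrapper
```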
The existing logic of only updating the fee if it is not within 2x of
the current 2-block-eta does not work well for the current mempool.
The current mempool looks a bit weird: you need ~20 sat/vbyte to even get into it,
but there is only around 5 MB of txs paying >25 sat/vbyte.
The estimates look like this:
```
>>> config.fee_estimates
{144: 25764, 25: 27075, 10: 34538, 5: 34538, 2: 34538}
```
This commit changes the logic so that we send update_fee if the old rate is
- below 75% of the current 2-block-eta (instead of 50%), or
- below the 25-block-eta
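The new condition can be expressed as a small predicate; the function and parameter names are illustrative, not the actual code:

```python
def should_send_update_fee(old_feerate: int, eta2: int, eta25: int) -> bool:
    """old_feerate: feerate currently enforced on the channel.
    eta2 / eta25: current 2-block and 25-block fee estimates."""
    # new logic: update if the old rate is below 75% of the 2-block eta
    # (previously: below 50%), or below the 25-block eta
    return old_feerate < 0.75 * eta2 or old_feerate < eta25
```

With the estimates quoted above (eta2=34538, eta25=27075), an old rate of 25764 now triggers an update, since it is below 75% of 34538 (~25904).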
In particular, lnd sends both a chan_reest and then an error ("sync error").
It is critical we process the chan_reest and transition to WE_ARE_TOXIC before
processing the error (which would trigger us to force-close).
see spec 8a64c6a1ce/02-peer-protocol.md (L1504-L1506) :
> - upon reconnection:
> [...]
> - MUST transmit channel_reestablish for each channel.
> - MUST wait to receive the other node's channel_reestablish message before sending any other messages for that channel.
Assume Alice and Bob have a channel, and Alice is on an old state,
but neither of them knows the latter yet.
Timing scenarios:
1. Alice sends reest first, and Bob receives it before sending reest himself
- old code: Bob realises Alice is behind, Bob will force-close,
Bob won't send reest to Alice, so Alice does not learn she is behind
- new code: Bob realises Alice is behind, Bob will force-close,
Bob will still send reest to Alice, so Alice learns she is behind.
2. Bob sends reest first, and Alice receives it before sending reest herself
- old code: Alice learns she is behind. Alice won't send reest to Bob,
so Bob does not learn he is ahead, so Bob won't force-close.
- new code: Alice learns she is behind. Alice will still send reest to Bob
    though with ctn=0 instead of the actual value. Bob learns he is ahead, so
Bob will force close.
3. Alice and Bob both send reest, and then they both receive what the other sent
- no change: Alice and Bob both learn Alice is behind. Bob will force-close.