We request chunks concurrently. This makes header-sync much faster
when we are many blocks behind.
notes:
- all chunks are downloaded from the same interface, just for simplicity
- we request up to 10 chunks concurrently (so 10*2016 headers)
- more chunks: higher memory requirements
- more chunks: higher concurrency => syncing needs fewer network round-trips
- if a chunk does not connect, bandwidth for all later chunks is wasted
- we can tweak the constant or make it dynamic or make it a configvar, etc, later
- without this, we progress the chain tip by around 1 chunk per second
- 52k blocks (1 year on mainnet) takes around 26 seconds
- this is probably not *that* interesting for mainnet,
but for testnet3, that sometimes has 200x the block-rate of mainnet,
it is extremely useful
- interface.tip is the server's tip.
- consider scenario:
- client has chain len 800_000, is up to date
- client goes offline
- suddenly there is a short reorg
e.g. blocks 799_998, 799_999, 800_000 are reorged
- client was offline for long time, finally comes back online again
- server tip is 1_000_000, tip_header does not connect to client's local chain
- PREVIOUSLY before commit, client would start backwards search
- first it asks for header 800_001, which does not connect
- then client asks for header ~600k, which checks
- client will do long binary search to find the forkpoint
- AFTER commit, client starts backwards search
- first it asks for header 800_001, which does not connect
- then client asks for header 799_999, etc
- that is, previously, on average, client did a short backwards search, followed by a long binary search
- now, on average, client does a longer backwards search, followed by a shorter binary search
- this works much nicer with the headers_cache
(- and thomasv said the old behaviour was not intentional)
We try to predict the next headers the interface will ask for,
and request them ahead of time, to be kept in the headers_cache.
This saves network latency/round-trips, for a bit more memory usage
and in some cases for more bandwidth.
Note that due to PaddedRSTransport.WAIT_FOR_BUFFER_GROWTH_SECONDS,
latency saved here can be longer than "real" network latency.
This speeds up
- binary search greatly,
- backwards search to a small degree
(although not that much as its algorithm should be changed a bit to make it cache-friendly)
- catch-up greatly, if it's <10 blocks behind
What remains is to speed up catch-up in case we are behind by many thousands of block.
That behaviour is left unchanged here. The issue there is that we request chunks sequentially.
So e.g. 1 chunk (2016 blocks) per 1 second.
According to the asyncio documentation:
> When a task is cancelled, asyncio.CancelledError will be raised in the task at the next opportunity.
I guess the 'next opportunity' means the next await statement.
Here the issue is that the task was not awaiting ever.
Note: ElectrumX needs a similar patch
follow-up 7a7c0f1606
- before ref commit, we logged new headers for every interface
- after ref commit, we logged new header only for fastest interface
- now, we log new header for fastest interface and for main interface
- useful to see if main interface is very slow
- Wallet.make_unsigned_transaction takes a FeePolicy parameter
- fee sliders act on a FeePolicy instead of config
- different fee policies may be used for different purposes
- do not detect dust outputs in lnsweep, delegate that to lnwatcher
The "ssl.VERIFY_X509_STRICT" flag for openssl verification has been enabled by default in python 3.13+ (it was disabled before that).
see https://github.com/python/cpython/issues/107361
and https://discuss.python.org/t/ssl-changing-the-default-sslcontext-verify-flags/30230/16
We explicitly disable it for self-signed certs, thereby restoring the pre-3.13 defaults,
as it seems to break lots of servers.
For example, using python 3.13 (or setting `sslc.verify_flags |= ssl.VERIFY_X509_STRICT`),
- I can connect to `btc.electroncash.dk:60002:s`
- but not to `electrum.emzy.de:50002:s`
despite both using self-signed certs.
We should investigate more why exactly "strict" verification fails for some self-signed certs and not for others,
and make sure that at least newly generated certs adhere to the stricter requirements (e.g. update guide in e-x?).
Previously when encountering a CA-signed cert that failed verification with "Hostname mismatch",
we would
1. erroneously mark it as self-signed
2. save its cert to pin it
3. when connecting to it later, and being served a CA-signed cert, we would reject the connection
- I think this is because we use the saved cert (the peer cert, just the last cert in the chain) as if it was a root CA,
and then during the connection we try to verify against that root. This fails as we are served a different root then.
Error logged in step(3):
```
3.85 | W | i/interface.[wirg2tsto7rme7n26lkd3ivbvxmjyy2pktlozwjuep22jcsfsghfqbqd.onion:50002] | Cannot connect to main server due to SSL error (maybe cert changed compared to "/home/user/.electrum/testnet/certs/wirg2tsto7rme7n26lkd3ivbvxmjyy2pktlozwjuep22jcsfsghfqbqd.onion"). Exc: ConnectError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)'))
```
This commit fixes step(1), we won't mark the cert as self-signed, instead the error is propagated out and the connection closed.
```
35.05 | I | i/interface.[wirg2tsto7rme7n26lkd3ivbvxmjyy2pktlozwjuep22jcsfsghfqbqd.onion:50002] | disconnecting due to: ErrorGettingSSLCertFromServer(ConnectError(SSLCertVerificationError(1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'wirg2tsto7rme7n26lkd3ivbvxmjyy2pktlozwjuep22jcsfsghfqbqd.onion'. (_ssl.c:1007)")))
```
Compare:
- SSLCertVerificationError(1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Hostname mismatch, certificate is not valid for 'wirg2tsto7rme7n26lkd3ivbvxmjyy2pktlozwjuep22jcsfsghfqbqd.onion'. (_ssl.c:1007)")
- verify_code=62
- SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1007)')
- verify_code=18
Note: the verify_code constants look stable, though they might be openssl-specific. I guess that's ok(?)
140540189c/include/openssl/x509_vfy.h.in (L224)
In fact, semantically it might be more correct to only trigger 'blockchain_updated' and not 'network_updated' here...
Anyway, 'blockchain_updated' should be triggered whenever the Blockchain object gets longer (or changes otherwise).
In particular, in qml, the NetworkOverview only updates the displayed height on 'blockchain_updated'.
```
191.73 | D | i/interface.[btc.electroncash.dk:60002] | (disconnect) trace for RPCError(-32603, 'internal error: bitcoind request timed out')
Traceback (most recent call last):
File "...\electrum\electrum\interface.py", line 514, in wrapper_func
return await func(self, *args, **kwargs)
File "...\electrum\electrum\interface.py", line 537, in run
await self.open_session(ssl_context)
File "...\electrum\electrum\interface.py", line 687, in open_session
async with self.taskgroup as group:
File "...\aiorpcX\aiorpcx\curio.py", line 304, in __aexit__
await self.join()
File "...\electrum\electrum\util.py", line 1309, in join
task.result()
File "...\electrum\electrum\interface.py", line 724, in request_fee_estimates
async with OldTaskGroup() as group:
File "...\aiorpcX\aiorpcx\curio.py", line 304, in __aexit__
await self.join()
File "...\electrum\electrum\util.py", line 1309, in join
task.result()
File "...\electrum\electrum\interface.py", line 1128, in get_estimatefee
res = await self.session.send_request('blockchain.estimatefee', [num_blocks])
File "...\electrum\electrum\interface.py", line 169, in send_request
response = await util.wait_for2(
File "...\electrum\electrum\util.py", line 1383, in wait_for2
return await asyncio.ensure_future(fut, loop=get_running_loop())
File "...\aiorpcX\aiorpcx\session.py", line 540, in send_request
return await self._send_concurrent(message, future, 1)
File "...\aiorpcX\aiorpcx\session.py", line 512, in _send_concurrent
return await future
aiorpcx.jsonrpc.RPCError: (-32603, 'internal error: bitcoind request timed out')
```
```
93.60 | E | asyncio | Task exception was never retrieved
future: <Task finished name='Task-7385' coro=<Interface.get_estimatefee() done, defined at ...\electrum\electrum\interface.py:1123> exception=RPCError(-32603, 'internal error: bitcoind request timed out')>
Traceback (most recent call last):
File "...\electrum\electrum\interface.py", line 1132, in get_estimatefee
res = await self.session.send_request('blockchain.estimatefee', [num_blocks])
File "...\electrum\electrum\interface.py", line 169, in send_request
response = await util.wait_for2(
File "...\electrum\electrum\util.py", line 1383, in wait_for2
return await asyncio.ensure_future(fut, loop=get_running_loop())
File "...\aiorpcX\aiorpcx\session.py", line 540, in send_request
return await self._send_concurrent(message, future, 1)
File "...\aiorpcX\aiorpcx\session.py", line 512, in _send_concurrent
return await future
aiorpcx.jsonrpc.RPCError: (-32603, 'internal error: bitcoind request timed out')
```
A new config API is introduced, and ~all of the codebase is adapted to it.
The old API is kept but mainly only for dynamic usage where its extra flexibility is needed.
Using examples, the old config API looked this:
```
>>> config.get("request_expiry", 86400)
604800
>>> config.set_key("request_expiry", 86400)
>>>
```
The new config API instead:
```
>>> config.WALLET_PAYREQ_EXPIRY_SECONDS
604800
>>> config.WALLET_PAYREQ_EXPIRY_SECONDS = 86400
>>>
```
The old API operated on arbitrary string keys, the new one uses
a static ~enum-like list of variables.
With the new API:
- there is a single centralised list of config variables, as opposed to
these being scattered all over
- no more duplication of default values (in the getters)
- there is now some (minimal for now) type-validation/conversion for
the config values
closes https://github.com/spesmilo/electrum/pull/5640
closes https://github.com/spesmilo/electrum/pull/5649
Note: there is yet a third API added here, for certain niche/abstract use-cases,
where we need a reference to the config variable itself.
It should only be used when needed:
```
>>> var = config.cv.WALLET_PAYREQ_EXPIRY_SECONDS
>>> var
<ConfigVarWithConfig key='request_expiry'>
>>> var.get()
604800
>>> var.set(3600)
>>> var.get_default_value()
86400
>>> var.is_set()
True
>>> var.is_modifiable()
True
```
15.00 | E | i/interface.[electrum.blockstream.info:50002] | Exception in run: ProtocolError(-32600, 'ill-formed response error object: cannot estimate fee for 25 blocks')
Traceback (most recent call last):
File "...\electrum\electrum\util.py", line 1261, in wrapper
return await func(*args, **kwargs)
File "...\electrum\electrum\interface.py", line 516, in wrapper_func
return await func(self, *args, **kwargs)
File "...\electrum\electrum\interface.py", line 539, in run
await self.open_session(ssl_context)
File "...\electrum\electrum\interface.py", line 689, in open_session
async with self.taskgroup as group:
File "...\aiorpcX\aiorpcx\curio.py", line 304, in __aexit__
await self.join()
File "...\electrum\electrum\util.py", line 1423, in join
task.result()
File "...\electrum\electrum\interface.py", line 726, in request_fee_estimates
async with OldTaskGroup() as group:
File "...\aiorpcX\aiorpcx\curio.py", line 304, in __aexit__
await self.join()
File "...\electrum\electrum\util.py", line 1423, in join
task.result()
File "...\electrum\electrum\interface.py", line 1128, in get_estimatefee
res = await self.session.send_request('blockchain.estimatefee', [num_blocks])
File "...\electrum\electrum\interface.py", line 171, in send_request
response = await asyncio.wait_for(
File "...\Python310\lib\asyncio\tasks.py", line 408, in wait_for
return await fut
File "...\aiorpcX\aiorpcx\session.py", line 540, in send_request
return await self._send_concurrent(message, future, 1)
File "...\aiorpcX\aiorpcx\session.py", line 512, in _send_concurrent
return await future
File "...\aiorpcX\aiorpcx\jsonrpc.py", line 721, in receive_message
item, request_id = self._protocol.message_to_item(message)
File "...\aiorpcX\aiorpcx\jsonrpc.py", line 273, in message_to_item
return cls._process_response(payload)
File "...\aiorpcX\aiorpcx\jsonrpc.py", line 220, in _process_response
raise cls._error(code, message, False, request_id)
aiorpcx.jsonrpc.ProtocolError: (-32600, 'ill-formed response error object: cannot estimate fee for 25 blocks')
The Qt GUI refreshes all tabs on 'blockchain_updated', which is expensive for very large wallets.
Previously, on every new block, the GUI would get one notification for each connected interface,
now only the fastest interface triggers it.
(was testing with a wallet where refreshing all tabs takes 15 seconds, and 10*15 seconds is
even more annoying...)
Note in particular that _RSClient.__aexit__ calls session.close()
so a lot of time can pass between interface.taskgroup raising and
handle_disconnect seeing the exception.
related https://github.com/spesmilo/electrum/issues/7677
closes https://github.com/spesmilo/electrum/issues/7677
```
E/n | network | taskgroup died.
Traceback (most recent call last):
File "/opt/electrum/electrum/network.py", line 1204, in main
[await group.spawn(job) for job in self._jobs]
File "/home/voegtlin/.local/lib/python3.8/site-packages/aiorpcx/curio.py", line 297, in __aexit__
await self.join()
File "/opt/electrum/electrum/util.py", line 1255, in join
task.result()
File "/opt/electrum/electrum/network.py", line 1277, in _maintain_sessions
await maintain_main_interface()
File "/opt/electrum/electrum/network.py", line 1268, in maintain_main_interface
await self._ensure_there_is_a_main_interface()
File "/opt/electrum/electrum/network.py", line 1245, in _ensure_there_is_a_main_interface
await self._switch_to_random_interface()
File "/opt/electrum/electrum/network.py", line 648, in _switch_to_random_interface
await self.switch_to_interface(random.choice(servers))
File "/opt/electrum/electrum/network.py", line 714, in switch_to_interface
await i.taskgroup.spawn(self._request_server_info(i))
File "/home/voegtlin/.local/lib/python3.8/site-packages/aiorpcx/curio.py", line 204, in spawn
self._add_task(task)
File "/home/voegtlin/.local/lib/python3.8/site-packages/aiorpcx/curio.py", line 150, in _add_task
raise RuntimeError('task group terminated')
RuntimeError: task group terminated
```
I believe the "suppress spurious cancellations" block was added as SilentTaskGroup raised
CancelledError instead of RuntimeError for this scenario.
aiorpcx 0.20 changed the behaviour/API of TaskGroups.
When used as a context manager, TaskGroups no longer propagate
exceptions raised by their tasks. Instead, the calling code has
to explicitly check the results of tasks and decide whether to re-raise
any exceptions.
This is a significant change, and so this commit introduces "OldTaskGroup",
which should behave as the TaskGroup class of old aiorpcx. All existing
usages of TaskGroup are replaced with OldTaskGroup.
closes https://github.com/spesmilo/electrum/issues/7446
I think this was originally needed due to incorrect management of group lifecycles,
which our current code is doing better.
also note that if we needed this, in newer aiorpcx, the name of
the field was ~changed from `_closed` to `joined`:
239002689a