Improved block sync speed

-A number of functions have been rewritten to run faster: calculate_total, is_unique, convert_to_satoshi, get_input_addresses, processVoutAddresses, prepare_vout, prepare_vin
-Txes are now written to the database via bulk writes, which improves the sync speed. The writes are batched so that data is saved once a certain threshold is reached, which also keeps memory usage under control
-The update_address function was renamed to update_addresses since it now bulk writes the addresses in batches, which improves the sync speed and keeps memory usage under control by saving data once a certain threshold is reached
-The syncLoop function has been completely removed from the project and replaced with async library loops, or normal "for" loops in some cases, which greatly improves sync speeds over large batches of data
-Fixed an issue with the flattened count of txes that is saved to the coinstats collection which could save incorrectly when using more than 1 thread
-Fixed an issue with the block sync which caused an unwanted delay when syncing fewer blocks than the number of threads used to sync the data
-Fixed an issue with vout data processing that could sometimes populate data out of order
-Added a new sync.batch_size setting used to determine how many records (txes, addresses, addresstxes) should be saved in a single database transaction
-Added a new wait_for_bulk_database_save setting which, when set to false, increases the block sync speed at the cost of not returning any error messages for data that failed to save
-The get_input_addresses function is no longer exported from the explorer.js file since it is only referenced within that file
-Updated the explorerspec tests that were affected by the function changes
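
The batching behaviour described above can be pictured with a minimal sketch. This is not the code from this commit: splitIntoBatches, saveInBatches and the bulkSave callback are hypothetical names used only for illustration, and "not waiting" is approximated here by simply not awaiting the save, which may differ from how the explorer implements it. Only the batch_size and wait_for_bulk_database_save setting names come from the commit itself.

```javascript
// Hypothetical helper: split records into chunks of at most batchSize items.
function splitIntoBatches(records, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1)
    throw new RangeError('batchSize must be a positive integer');

  const batches = [];

  for (let i = 0; i < records.length; i += batchSize)
    batches.push(records.slice(i, i + batchSize));

  return batches;
}

// Hypothetical helper: save the records of one block, one batch at a time.
// bulkSave is an assumed async function (batch) => Promise that stands in
// for a real database bulk write.
async function saveInBatches(records, settings, bulkSave) {
  const failures = [];

  for (const batch of splitIntoBatches(records, settings.batch_size)) {
    if (settings.wait_for_bulk_database_save) {
      // Slower: wait for each batch so that failures can be reported
      try {
        await bulkSave(batch);
      } catch (err) {
        failures.push({ size: batch.length, message: err.message });
      }
    } else {
      // Faster: send the batch and move on, so failures are never reported
      bulkSave(batch).catch(() => {});
    }
  }

  return failures;
}

// Usage with a stand-in save function that only pretends to write data
const txes = Array.from({ length: 12 }, (_, i) => ({ txid: 'tx' + i }));
const fakeBulkSave = async (batch) => batch.length;

saveInBatches(txes, { batch_size: 5, wait_for_bulk_database_save: true }, fakeBulkSave)
  .then((failures) => console.log('failed batches:', failures.length));
```

With 12 records and a batch_size of 5, the sketch saves three batches (5, 5 and 2 records). If the record count is lower than batch_size, everything goes out in a single batch.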

Special thanks to Karzo from Pepecoin for help with the bulkwrite code changes!
Joe Uhren
2025-02-02 19:10:17 -07:00
parent 0b0ef817f1
commit 3a2f679201
10 changed files with 966 additions and 867 deletions
+15 -1
@@ -1499,10 +1499,24 @@
// BALANCES : get the supply by running a query on the addresses collection and summing up all positive balances (potentially a long running query for blockchains with tons of addresses)
// TXOUTSET : retrieved from gettxoutsetinfo rpc cmd
"supply": "GETINFO",
// batch_size: The maximum number of records before saving data to the database.
// This value is used for syncing transactions, addresses and address transactions.
// Records are processed one block at a time, so batching only happens when a single block contains more than `batch_size` records of one type (more than `batch_size` txes, for example).
// If the number of txes is lower than `batch_size` then all txes are saved in one batch.
// A higher batch_size can save data faster than a series of smaller batches, but only up to a certain point since larger batches also require more memory and resources. The optimal number depends entirely on your server's resources.
// A lower batch_size generally ensures there will be no memory limitations, although it can also slow down the sync process.
// It is recommended to leave this value alone unless you know what you are doing, although some experimentation with different batch sizes using the benchmark script can often help determine the optimal setting for your server.
"batch_size": 5000,
// elastic_stack_size: If a "RangeError: Maximum call stack size exceeded" error occurs during a block sync (which can happen when dealing with large transactions with many addresses), the sync script will automatically be reloaded using a larger stack size value which increases memory usage based on this value.
// NOTE: If the first reload of the sync script still doesn't have enough memory to handle processing of a large transaction, the sync is smart enough to continue increasing the stack size by this value again and again until it finishes processing all blocks and then returns back to the default amount of memory for future blocks.
// It is recommended to leave this value alone unless you know what you are doing.
"elastic_stack_size": 4096
"elastic_stack_size": 4096,
// wait_for_bulk_database_save: Determines whether to wait for all records to be saved or just send the records without waiting for save confirmation when saving bulk data.
// This setting only controls how to treat records that are bulk saved to the database, which include txes, addresstxes and addresses.
// If set to true, bulk transactions to the database will wait for save confirmation, which results in a slower save time but also returns information about which records failed to save.
// If set to false, bulk transactions to the database will not wait for save confirmation, which results in a faster save time but will not return any error message for records that failed to save.
// NOTE: If you want to sync data as fast as possible and are sure that your blockchain doesn't contain any problematic or unsupported data types, then you can set this value to "false" to maximize the speed of the block sync.
"wait_for_bulk_database_save": true
},
// captcha: a collection of settings that pertain to the captcha security used by different elements of the explorer