Enhance Mempool performance #226

Merged: 103 commits into kaspanet:master on Oct 6, 2023
Conversation

@tiram88 (Collaborator) commented Jul 18, 2023

This PR addresses the issue of the mempool being a bottleneck in the first testnet-11 experiment @ 10 BPS.

The previous monolithic design locked both the Mempool and, to some extent, the virtual processor during every call to its public functions, through the Manager that owns it, leading to long delays. The new general design splits these functions into simpler atomic steps, each logically protected by a lock on the mempool only when the mempool is actually involved. This approach has the following consequences:

  • The Manager owning the Mempool instance is no longer a simple pass-through that locks and publicly exposes mempool functions; a significant part of the mempool logic moves into the Manager.
  • Some verification steps must sometimes be duplicated, notably double-spend checks, since the execution of a Manager function is no longer atomic as a whole.
  • Some processes are reorganized and some behaviors change, taking advantage of the potential the new design offers.

Further performance improvements are introduced:

  • Processing batches of transactions in topological order, forming sub-batches by level of chained dependency
  • Parallel validation of sub-batches of transactions by the virtual processor, processed in chunks capped at a maximal mass so that the virtual processor does not lock for too long (see the sketch after this list)
  • A more efficient algorithm for making room in the mempool when it is full and a transaction must be added. The behavior also changes: the transaction with the lowest fee rate is removed, but (this is new) only among transactions having no chained dependency and not being parents of the transaction getting added
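
As an illustration of the mass-bounded chunking, here is a minimal sketch; `mass` is a hypothetical per-transaction accessor and the cap value is illustrative (the actual bound is the maximum block mass):

```rust
/// Illustrative cap; the actual bound is the maximum block mass.
const MAX_CHUNK_MASS: u64 = 500_000;

/// Split a batch into chunks of bounded cumulative mass, so that each
/// submission to the virtual processor holds its lock only briefly.
fn chunk_by_mass<T>(txs: Vec<T>, mass: impl Fn(&T) -> u64) -> Vec<Vec<T>> {
    let mut chunks = Vec::new();
    let mut current: Vec<T> = Vec::new();
    let mut current_mass = 0u64;
    for tx in txs {
        let m = mass(&tx);
        // Close the current chunk when the cap would be exceeded, but always
        // keep at least one tx per chunk, even one heavier than the cap.
        if !current.is_empty() && current_mass + m > MAX_CHUNK_MASS {
            chunks.push(std::mem::take(&mut current));
            current_mass = 0;
        }
        current_mass += m;
        current.push(tx);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```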

Validate and insert a single transaction

In order to achieve much finer lock granularity, the function validating and inserting a transaction into the mempool is split into 4 steps (a minimal sketch follows the list):

  1. pre-validation (read lock on Mempool)
  2. validation by the virtual processor (no lock on Mempool)
  3. post-validation and insertion (write lock on Mempool)
  4. validation and insertion of unorphaned transactions (see below)
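
A minimal sketch of this split, with stub types standing in for the real mempool, consensus and transaction types (all helper names here are illustrative, not the actual API):

```rust
use std::sync::{Arc, RwLock};

// Stub types standing in for the real ones.
struct Mempool;
struct Consensus;
struct Tx;

impl Mempool {
    fn pre_validate(&self, _tx: &Tx) -> Result<Tx, String> { Ok(Tx) }
    fn post_validate_and_insert(&mut self, tx: Tx) -> Result<Vec<Tx>, String> {
        // Double spends would be re-checked here: steps 1-3 are no longer
        // atomic as a whole, so another tx may have landed in between.
        let _ = tx;
        Ok(Vec::new()) // newly unorphaned txs
    }
}
impl Consensus {
    fn validate_mempool_transaction(&self, tx: Tx) -> Result<Tx, String> { Ok(tx) }
}

fn validate_and_insert(
    mempool: &Arc<RwLock<Mempool>>,
    consensus: &Consensus,
    tx: Tx,
) -> Result<Vec<Tx>, String> {
    // 1. pre-validation under a short-lived read lock
    let prepared = mempool.read().unwrap().pre_validate(&tx)?;
    // 2. validation by the virtual processor, with the mempool fully unlocked
    let validated = consensus.validate_mempool_transaction(prepared)?;
    // 3. post-validation and insertion under a write lock
    let unorphaned = mempool.write().unwrap().post_validate_and_insert(validated)?;
    // 4. the returned unorphaned txs are then validated and inserted in turn
    Ok(unorphaned)
}
```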

Validate and insert unorphaned transactions

Following any insertion of a transaction into the mempool, some orphan transactions may be unorphaned. Here again the new design is put to use: on insertion, room is made on the fly if necessary, and the process keeps looping as long as new transactions get unorphaned (both new behaviors). A sketch follows the list.

  1. parallel validation of batches of transactions (no lock)
  2. post-validation and insertion (write lock)
  3. in case of new unorphaned transactions, loop to step 1 (new behavior)
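
A standalone sketch of the loop, with closures standing in for the lock-scoped phases (names are illustrative):

```rust
/// Keep validating and inserting until a round unorphans nothing new.
/// `validate` runs in parallel with no mempool lock; `insert` holds a write
/// lock and returns the transactions that the insertions just unorphaned.
fn unorphan_loop<T>(
    mut batch: Vec<T>,
    validate: impl Fn(T) -> Option<T>,
    insert: impl Fn(Vec<T>) -> Vec<T>,
) {
    while !batch.is_empty() {
        // 1. parallel validation of the batch (no lock)
        let validated: Vec<T> = batch.into_iter().filter_map(&validate).collect();
        // 2. write-locked insertion, making room on the fly if needed
        // 3. loop to step 1 with any newly unorphaned transactions
        batch = insert(validated);
    }
}
```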

Validate and insert transactions in batch

This function is part of the transaction relay flow, where batches of transactions are broadcast to peers. The batch is processed in topological order by level of chained dependency (new behavior; a sketch of the leveling follows the list). For each level:

  1. pre-validation (read lock)
  2. parallel validation of the sub-batch of transactions by the virtual processor (no lock)
  3. post-validation and insertion (write lock)
  4. validation and insertion of unorphaned transactions
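
A sketch of the leveling on simplified types, where each transaction is reduced to an id plus the ids of its in-batch parents, assuming the batch is already topologically sorted (parents before children):

```rust
use std::collections::{HashMap, HashSet};

/// Group a batch into levels of chained dependency: level 0 spends no output
/// of another tx in the batch, level 1 spends level-0 outputs, and so on.
/// Each level can then be validated in parallel before the next is processed.
fn topological_levels(txs: &[(u64, Vec<u64>)]) -> Vec<Vec<u64>> {
    let ids: HashSet<u64> = txs.iter().map(|(id, _)| *id).collect();
    let mut level_of: HashMap<u64, usize> = HashMap::new();
    let mut levels: Vec<Vec<u64>> = Vec::new();
    for (id, parents) in txs {
        // A tx sits one level above its highest in-batch parent.
        let level = parents
            .iter()
            .filter(|p| ids.contains(*p))
            .filter_map(|p| level_of.get(p))
            .map(|l| l + 1)
            .max()
            .unwrap_or(0);
        level_of.insert(*id, level);
        if levels.len() <= level {
            levels.push(Vec::new());
        }
        levels[level].push(*id);
    }
    levels
}
```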

Handle the transactions of a block newly added to the DAG

Here again, a split into 3 steps adds lock granularity.

  1. handling of transactions (write lock)
  2. validation and insertion of unorphaned transactions
  3. expiring the low priority transactions (write lock)

Revalidate high priority transactions

This is probably the most significant bottleneck alleviation, together with tx relay broadcasting. The process runs only every 30 seconds, yet for a node receiving a very high rate of local transactions it previously implied very compute-intensive processing, all under a single atomic lock of the mempool.

It is redesigned as follows (a simplified sketch follows the list):

  1. getting all the high priority transactions (read lock)
  2. processing the batch in topological order by level of chained dependency, instead of a "standard" Kahn's in-degree algorithm (new behavior)
  3. populating all transactions with UTXO entries found in the mempool (read lock)
  4. parallel validation of the sub-batch of transactions by the virtual processor (no lock)
  5. updating the transactions in the mempool and removing the invalid ones, along with all their redeemers (write lock)
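
A simplified sketch of this flow, once more with closures standing in for the lock-scoped phases (all names illustrative):

```rust
/// Revalidate high priority transactions level by level, holding each lock
/// only for its own phase; consensus validation runs with no mempool lock.
fn revalidate_high_priority<T>(
    snapshot: impl Fn() -> Vec<T>,                 // 1. read lock: collect HP txs
    levelize: impl Fn(Vec<T>) -> Vec<Vec<T>>,      // 2. topological levels (no lock)
    populate: impl Fn(Vec<T>) -> Vec<T>,           // 3. read lock: fill mempool UTXO entries
    validate: impl Fn(Vec<T>) -> (Vec<T>, Vec<T>), // 4. no lock: split into (valid, invalid)
    apply: impl Fn(Vec<T>, Vec<T>),                // 5. write lock: update valid, evict invalid + redeemers
) {
    for level in levelize(snapshot()) {
        let (valid, invalid) = validate(populate(level));
        apply(valid, invalid);
    }
}
```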

Since the process now locks both the mempool and the consensus in a much more granular way, some high priority transactions present at the start may get removed (mined in a block, invalidated, etc.) during the relatively long run. Depending on the sequence of events, such a transaction id may or may not be returned as accepted, but no impact on the node is expected: it will simply try to rebroadcast a transaction id that is no longer present, and the peers requesting it will eventually get a TransactionNotFound, which can and does occur anyway for other reasons.

The behavior also changes compared to the golang version: on a transaction validation error, the error is simply logged and execution continues, whereas in golang execution halts and returns the error.

Performance gains

A note of caution: an upcoming PR will add benchmarking infrastructure making it possible to measure the performance gains, and hence to what extent this new design actually alleviates the bottleneck.

Depending on the measurements, further adjustments might be needed.

@michaelsutton merged commit a59214e into kaspanet:master on Oct 6, 2023 (6 checks passed)
smartgoo pushed a commit to smartgoo/rusty-kaspa that referenced this pull request Jun 18, 2024
* Split mempool atomic validate and insert transaction in 3 steps

* Process tx relay flow received txs in batch

* Use a single blocking task per MiningManagerProxy fn

* Split parallel txs validation in chunks of max block mass

* Abstract expire_low_priority_transactions into Pool trait

* Making room in the mempool for a new transaction won't remove chained txs nor parent txs of the new transaction

* Refine lock granularity on Mempool and Consensus while processing unorphaned transactions (wip)

* Fix failing test

* Enhance performance & refine lock granularity on Mempool and Consensus while revalidating high priority transactions

* Comments

* Fix upper bound of transactions chunk

* Ensure a chunk has at least 1 tx

* Prevent adding the same tx twice to the mempool

* Clear transaction entries before revalidation

* Add some logs and comments

* Add logs to debug transactions removals

* On accepted block do not remove orphan tx redeemers

* Add 2 TODOs

* Fix a bug of high priority transactions being unexpectedly orphaned or rejected

* Refactor transaction removal reason into an enum

* Add an accepted transaction ids cache to the mempool and use it to prevent reentrance in mempool, broadcasting to and asking from peers

* Improve the filtering of unknown transactions in tx relay

* Enhance tx removal logging

* Add mempool stats

* Process new and unorphaned blocks in topological order

* Run revalidation of HP txs in a dedicated task

* Some profiling and debug logs

* Run expiration of LP txs in a dedicated task

* remove some stopwatch calls which were timing locks

* crucial: fix exploding complexity of `handle_new_block_transactions`/`remove_transaction`

* fixes in `on_new_block`

* refactor block template cache into `Inner`

* make `block_template_cache` a non-blocking call (never blocks)

* Log build_block_template retries

* While revalidating HP txs, only recheck transaction entries

* Fix accepted count during revalidation

* mempool bmk: use client pools + various improvements

* Improve the topological sorting of transactions

* Return transaction descendants BFS ordered + some optimizations

* Group expiration and revalidation of mempool txs in one task

* Refine the schedule of the cleaning task

* ignore perf logs

* maintain mempool ready transactions in a dedicated set

* Bound the returned candidate transactions to a maximum

* Reduces the max execution time of build block template

* lint

* Add mempool lock granularity to get_all_transactions

* Restore block template cache lifetime & make it customizable in devnet-prealloc feature

* Restore block template cache lifetime & make it customizable in devnet-prealloc feature

* Relax a bit the BBT maximum attempts constraint

* Refactor multiple `contained_by_txs` fns into one generic

* Test selector transaction rejects & fix empty template returned by `select_transactions` upon selector reuse

* Log some mempool metrics

* Handle new block and then new block template

* turn tx selector into an ongoing process with persistent state (wip: some tests are broken; selector is not used correctly by builder)

* use tx selector for BBT (wip: virtual processor retry logic)

* virtual processor selector retry logic

* make BBT fallible by some selector criteria + comments and some docs

* add an infallible mode to virtual processor `build_block_template()`

* constants for tx selector successful decision

* Add e-tps to logged mempool metrics

* avoid realloc

* Address review comments

* Use number of ready txs in e-tps & enhance mempool lock

* Ignore failing send for clean tokio shutdown

* Log double spends

* Log tx script cache stats (wip)

* Ease atomic lock ordering & enhance counter updates

* Enhance tx throughput stats log line

* More robust management of cached data life cycle

* Log mempool sampled instead of exact lengths

* avoid passing consensus to orphan pool

* rename to `validate_transaction_unacceptance` and move to before the orphan case (accepted txs will usually be orphan)

* rename `cleaning` -> `mempool_scanning`

* keep intervals aligned using a round-up formula (rather than a loop)

* design fix: avoid exposing full collections as mut. This violates encapsulation logic since collections can be completely modified externally; while in tx pools it is important to make sure various internal collections are maintained consistently (for instance the `ready_transactions` field on `TransactionsPool` needs careful maintenance)

* minor: close all pool receivers on op error

* `remove_transaction`: no need to manually update parent-child relations in the case `remove_redeemers=false`. This is already done via `remove_transaction_from_sets` -> `transaction_pool.remove_transaction`. + a few minor changes

* encapsulate `remove_transaction_utxos` into `transaction_pool`

* no need to `remove_redeemers_of` for the initial removed tx since this happens as part of:
`remove_from_transaction_pool_and_update_orphans` -> `orphan_pool.update_orphans_after_transaction_removed` -> `orphan_pool.remove_redeemers_of`

* inline `remove_from_transaction_pool_and_update_orphans`

* remove redeemers of expired low-prio txs + register scan time and daa score after collection (bug fix)

* change mempool monitor logs to debug

* make tps logging more accurate

* import bmk improvements from mempool-perf-stats branch

* make `config.block_template_cache_lifetime` non-feature dependent

---------

Co-authored-by: Michael Sutton <[email protected]>