Stress Testing Rif Consensus Node’s World State
By Pablo Pedemonte
In 2016 RSK proposed the Unitrie as an alternative to Merkle Patricia Trees for modeling Ethereum’s world state (see Ethereum Yellow Paper, Appendix D). As of January 2020, we have added as an opt-in feature the Unitrie to RSK’s Rif Consensus Node (based on Besu‘s code). Consequently, users can now pick between two world state implementations:
- Besu world state, (or classic world state) as described in the Yellow Paper.
- Unitrie world state, backed up by the Unitrie.
Having the chance to run a Rif Consensus Node node picking from two different world state implementations, we are in a position to compare the merits of each world state variant. In this article we will compare both implementations, analyzing construction/lookup times, and account allocation capacity before exhausting available storage.
Experiment 1: Construction and Lookup Times
For analyzing construction and lookup times we executed a benchmark that consists of adding a given number of accounts to an initially empty Ethereum world state, and committing changes (i.e., storing the created accounts in the world state’s trie/Unitrie) fifty times during benchmark execution. Once all the accounts are committed, we look up for one million accounts in two variants:
- Look up for one million existing accounts (all accounts found).
- Look up for one million random accounts (no account found in practice).
The benchmark finally reports the construction and lookup times. We execute the benchmark on a 4Gb Java heap for an interval ranging from 3M to 10M externally owned accounts. We handle variance by running the benchmark six times for each number of accounts. The following table shows average construction and lookup times for 3M, 6M, and 9M accounts. CT stands for Construction Time, ELT for Existing Accounts Lookup Time, and RLT means Random Accounts Lookup Time. All times are in seconds.
Table 1. Construction and lookup times
Our Unitrie implementation is designed to be memory efficient, in most cases happily trading off execution time for lower memory footprint. This choice makes the Unitrie compare unfavourably with Ethereum’s trie when it comes to lookup times, the Unitrie being 1.2 to 1.8 times slower. But it pays off when considering the number of allocated accounts. The Unitrie can handle 9M accounts in memory, while Besu’s world state implementation barely reaches the 6M barrier. Indeed, constructing a trie with 6M accounts is 1.47 times slower for the classic trie. This shows that Besu is thrashing the JVM to operate at the 6M accounts threshold.
Table 2. Construction and lookup throughputs
Table 2 summarizes construction and lookup throughputs. The acronym CTp is short form for Construction Throughput, and measures the number of updates to the Unitrie/Ethereum’s world state per second. Similarly, ELTp and RLTp stand for Exiting Account Lookup and Random Account Lookup Throughput (lookups per second) respectively. Generally speaking, the Unitrie can perform between 55% – 80% of the operations that Besu’s world state would perform in a given amount of time. Notably and because of trashing, Unitrie’s throughput becomes 1.5 times Besu’s world state throughput when inserting 6M accounts. Also, for that number of accounts, the Unitrie’s lookup throughput for one million random elements matched Besu’s world state.
Note that the lookup throughput is 67K lookups/sec in the worst case (9M accounts, existing accounts lookup). To convince ourselves this is acceptable, consider the BALANCE operation, whose cost is 400 Gas. Given the current gas limit of an RSK block of 6.8M Gas, ideally you can execute at most 6.8M/400 = 17K BALANCE operations in a block, which even at 67K lookups per sec. the Unitrie will perform in 0.25 secs., assuming the trie is stored in memory.
Figures 1, 2, and 3 respectively depict construction and existing/random accounts lookup times when allocating accounts in the [3M, 10M] range. The plots display the central tendency and confidence intervals for each data point. Note that for a 4Gb Java heap, the Unitrie starts trashing at the 9M account threshold (cf., Besu’s world state, trashing at the 6M threshold). In addition, the Unitrie exhibits broader confidence intervals. This can be attributed to the Unitrie implementation storing shared paths as soft references, thus non-deterministically adding path recomputation overhead to run time.
Experiment 2: Contract Accounts
The Unitrie stores each account under a key derived from its address hash. By construction account keys have a random prefix and equal length, so the result will be on average a binary balanced tree with accounts in the leaves. This means that N externally owned accounts will require 2N – 1 nodes.
When considering contract accounts, the Unitrie’s storage capacity will necessarily reduce in relation to previous calculation. This is so because the Unitrie stores code as the right child of the corresponding account node. Therefore for N accounts where α is the ratio between contract
Figure 1. World State Construction Time
Figure 2. Lookup times (Existing Accounts)
Figure 3. Lookup times (Random Accounts)
and externally owned accounts, the Unitrie size becomes (2 + α)N – 1 nodes. This ranges from 2N – 1 nodes when α=0 (thus subsuming the initial calculation for N externally owned accounts) to 3N – 1 nodes when α=1 (all accounts are contracts). Conversely Ethereum’s world state stores account code outside the trie, so the storage capacity of the trie won’t be affected when considering contract accounts.
To measure the Unitrie’s storage capacity variation induced by different values of α, we ran a benchmark allocating as many accounts as possible in a 4Gb Java heap before trashing. We run the experiment for several values of α in the [0,1] interval. The benchmark uses a real world 2114 bytes ERC20 contract for the account code. Figure 4 depicts the results, pinpointing α=0.8 as the point where the Unitrie crosses the maximum storage capacity of Besu Ethereum’s world state. It also shows the Unitrie operational range on a 4Gb heap: it can store anywhere between 9.5M/9M externally owned accounts (α=0) and 4.9M contract accounts (α=1).
Figure 4. Storage capacity variation induced by contract accounts
The Unitrie trades computation time for memory footprint. Figures 2 and 3 show that the Unitrie lookup times are around 1.5 to 2 times slower for accounts in the [3M, 5M] range. However, that difference would vanish in a setting where Besu’s world state ends up reading nodes from disk because it can’t fit enough of the world state in memory. In that case, disk I/O latency would dominate lookup times. Our experiments show that the Unitrie can store in a 4Gb heap a world state consisting of 9M externally owned accounts, while Besu’s world state starts degrading at the 6M accounts limit. Overall, the Unitrie has a greater potential for avoiding disk I/O, at the expense of 1.8X slower lookups (in the worst case) compared to Besu’s world state.
When considering contract accounts, the storage capacity is determined by α, the ratio between contract and externally owned accounts. The lower the value of α, the more nodes that can be fit in memory, again maximizing the chances of avoiding disk I/O latency. Experiment two shows that the Unitrie’s storage capacity stays above Besu’s for α < 0.8.
What’s an estimation for α? As of Nov. 2020 Etherscan reports 80.28M unique accounts on Mainnet. On the same date the number of contract accounts was 12.4M. That gives α=0.154. As of the time of writing, Mainnet has 94.14M unique addresses. Unrealistically assuming that every account created since Nov 2020 was a contract, that gives a grossly overestimated upper bound for α=0.28. Even so, that’s still in the Unitrie’s comfort zone!
Appendix A: Running the Experiments
All of the experiment code is hosted in the Github project rif-consensus-node, and requires JDK 11 or later. After cloning the project, you can run the Unitrie stress test or Ethereum’s classic world state test by going to the project’s root folder and executing:
Where accounts_to_create defines the number of accounts to create, and the parameter alpha_ratio corresponds to the ratio between contract and externally owned accounts defined in Experiment two.
We run our experiments on MacBook Pro with an Intel i5 2.3 GHz dual core processor, 16Gb of RAM and a 250GB SSD. All of the experiments were executed on a 4Gb max size JVM heap, using the G1 garbage collector.