Create real-world test #7

daniellm · 2023-09-01T13:22:33Z

Verkle trees behave different than Hexary Merkle trees in terms of performance (both CPU and RAM).
We need to tune the performance to match our use cases, by choosing appropriate data structures and algorithms.
For that, it would be helpful to use real-world samples of access to the Merkle tree, and translate them to access to the Verkle tree.

Todo:

Take ~10000 consecutive blocks of ethereum (~1 day)
Execute the transactions and contracts
Whenever a transaction/contract depends on state from before our block range, dump that state into a binary file; this will be the "setup"
Whenever a transaction/contract reads or writes data, dump these operations into a binary file; this will be the "execution"
The data format should be translated from the merkle kvp structure to verkle kvp structure
Write this data in a dense binary format and push to repository (should be <50mb, or use less blocks)
The "setup" data can be used to pre-populate the sample tree
- This will provide a real-world sample of how the full tree will look like
- Compute some stats:
  - How many nodes have a single child?
  - What's the median number of children per node?
  - Make a histogram of # of children for internal nodes, as well as values leaf nodes
- This will help understand what RAM optimizations can be done
The "execution" data can be used to:
- Create a realistic benchmark, which we can use to asses how a certain change to the code affects performance
- Profile the code to tune performance

namn-grg · 2023-09-09T08:38:43Z

Summary of discussion done in discord -

This is more of a performance analysis, and since right now we are in the early stage of implementation, this is low priority/ enhancement.
Ignacio (jsign) has done a similar analysis in the past here. The difference is we would like to go one step further and generate files containing the access operations done to the verkle trie, so we can replay it at ease.
Gballet also did something similar in this PR
After talking with Ignacio he found this useful and said that this could be a great quick-"e2e" test. Also he said "For geth tests and optimizations, we've been replaying some chain history which clearly is heavier... but we re interested in testing the whole thing: a) The state migration using the overlay tree method (thus having dual tree access + extra load migrating state), 2) After the transition finishes, keep going with the VKT with the full state (which I think is what you're trying to focus). Of course as you imagine the setup for that is not light enough to be pushed to your gh repo... so I think that above is good."

daniellm added the enhancement New feature or request label Sep 1, 2023

namn-grg assigned namn-grg and unassigned namn-grg Oct 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Create real-world test #7

Create real-world test #7

daniellm commented Sep 1, 2023

namn-grg commented Sep 9, 2023

Create real-world test #7

Create real-world test #7

Comments

daniellm commented Sep 1, 2023

namn-grg commented Sep 9, 2023