
[Execution state] complete ledger - Separation of node types (leaf, intermediate) #1745

Open
Tracked by #1744
ramtinms opened this issue Dec 9, 2021 · 3 comments

ramtinms (Member) commented Dec 9, 2021

The current implementation of the complete ledger reuses the same node structure type for both leaf and intermediate nodes. Separating the node types into two structures should reduce memory consumption because of how Go lays out struct fields: leaf nodes don't need the pointers to the left and right children, and intermediate nodes don't need the payload pointer or the path.
The current code already supports different construction methods for node creation, which should make it easier to separate the node types, though the checkpointer currently uses a default constructor and might need some workaround.
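For reference, the current shape is roughly a single struct along these lines (illustrative field names, not the exact flow-go definition), where each kind of node leaves some fields unused:

```go
package mtrie // hypothetical package name; sketch only

// Illustrative only; the actual flow-go node struct may carry additional
// bookkeeping fields.
type Payload struct{ Value []byte }

type node struct {
	lChild, rChild *node    // unused on leaf nodes:    2 x 8 B per leaf
	height         int
	path           [32]byte // unused on interim nodes: 32 B per interim node
	payload        *Payload // unused on interim nodes:  8 B per interim node
	hashValue      [32]byte
}
```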

Potential challenges

  • [backward compatibility] the updated code should be able to read the checkpoints that were generated by the current version of the code.
Ullaakut commented Dec 21, 2021

Hi @ramtinms!

Regarding this task

My proposal to implement this task is the following:
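(The detailed proposal isn't reproduced here. As a rough illustration, assuming the split puts leaf and branch nodes behind a shared interface, it might look something like this; names, fields and signatures are illustrative, not the actual proposal or the flow-go API:)

```go
package mtrie // hypothetical package name; sketch only

type Payload struct{ Value []byte }

// Node is the abstraction both concrete kinds would implement.
type Node interface {
	Hash() [32]byte
}

// InterimNode keeps only what branches need: children, height, hash.
type InterimNode struct {
	LChild, RChild Node
	Height         int
	HashValue      [32]byte
}

// LeafNode keeps only what leaves need: path, payload, height, hash.
type LeafNode struct {
	Height    int
	Path      [32]byte
	Payload   *Payload
	HashValue [32]byte
}

func (n *InterimNode) Hash() [32]byte { return n.HashValue }
func (n *LeafNode) Hash() [32]byte    { return n.HashValue }

// Code that needs kind-specific fields (traversal, checkpoint encoding, ...)
// switches on the concrete type; this dispatch is where the performance
// concern below comes in. Assumes non-nil children for brevity.
func lookup(n Node, path [32]byte, depth int) *Payload {
	for {
		switch v := n.(type) {
		case *LeafNode:
			if v.Path == path {
				return v.Payload
			}
			return nil
		case *InterimNode:
			if path[depth/8]&(1<<(7-depth%8)) != 0 {
				n = v.RChild
			} else {
				n = v.LChild
			}
			depth++
		default:
			return nil
		}
	}
}
```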

The potential drawback with this is that while it improves clarity and maintainability significantly, it comes at a potentially small but non-negligible cost in performance. I could run benchmarks to quantify the impact that this has if you think it is necessary. AFAIK, the overhead is very minimal, but I assume that every thousandth of a nanosecond counts for such a critical component as the ledger trie.

When it comes to memory savings, we can already compute an approximation.
Each branch node would save 32 bytes for the ledger.Path, which is currently not a pointer but a fixed-length 32-byte array that stays empty on branches, plus the 8 bytes of the payload pointer, so 40 bytes per branch.
Each leaf node would save 16 bytes for the two unnecessary child pointers.

Since the mainnet checkpoint currently contains ~150 million nodes, about 70 million of which are leaves and 80 million are branches (if I recall correctly), that would result in about 1.2GB of savings for the leaves and 3.2GB for the branches, for a total of 4.4GB of savings over ~30GB of raw data, so an expected improvement of around 15%.
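A quick back-of-the-envelope check of those figures (the counts are the approximate numbers quoted above, not measured values):

```go
package main

import "fmt"

func main() {
	const (
		leaves         = 70_000_000 // approximate leaf count from above
		branches       = 80_000_000 // approximate branch count from above
		leafSavingB    = 16         // two 8-byte child pointers
		interimSavingB = 40         // 32-byte path + 8-byte payload pointer
	)
	leafGB := float64(leaves*leafSavingB) / 1e9
	branchGB := float64(branches*interimSavingB) / 1e9
	fmt.Printf("leaves: ~%.1f GB, branches: ~%.1f GB, total: ~%.1f GB\n",
		leafGB, branchGB, leafGB+branchGB)
	// leaves: ~1.1 GB, branches: ~3.2 GB, total: ~4.3 GB
}
```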

Is that what you had in mind?

Extension Node proposal

Experimental results show that when using extension nodes with the mainnet-14 checkpoint, the size of the trie after rebuilding it from the checkpoint is reduced by around 30%, going from 14.5GB (without payloads) to 10.3GB. The process of trimming the trie, however, which essentially consists of going through all the leaves of the original trie and recreating a new trie from them, turns out to be quite slow. On the machine I use for testing, it takes about 30 minutes to read the checkpoint and rebuild the nodes, but then 5 to 6 hours to trim the resulting trie.


Once the new trie is ready, however, inserting and reading values should be slightly faster, since there are fewer nodes to traverse to reach the leaves. I think it's a tradeoff that might be worth discussing, WDYT?
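For context, an extension node in this sense typically has a shape like the following; this is a hypothetical layout, not the one used in the experiment above:

```go
package mtrie // hypothetical package name; sketch only

// An extension node stands in for a run of branch nodes that each have a
// single non-empty child: it records the skipped portion of the path and
// points directly at the node below that run.
type Node interface{ Hash() [32]byte }

type ExtensionNode struct {
	// Bits [SkipStart, SkipEnd) of the 256-bit path are covered by this node.
	SkipStart, SkipEnd int
	PathSegment        [32]byte // only the covered bits are meaningful
	Child              Node     // the subtrie below the skipped run
	HashValue          [32]byte
}

func (n *ExtensionNode) Hash() [32]byte { return n.HashValue }
```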

fxamacker (Member) commented:

The potential drawback with this is that while it improves clarity and maintainability significantly, it comes at a potentially small but non-negligible cost in performance. I could run benchmarks to quantify the impact that this has if you think it is necessary.

Hi Brendan, I agree there might be a potential penalty and we probably should run benchmarks.

At a glance, the performance of interface function calls vs non-interface function calls appears to be similar if the function isn't inlined. The performance hit is larger when the function can be inlined. Since Go 1.17 can inline some functions that couldn't be inlined with 1.16, maybe both versions should be used in benchmarks.
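For reference, the four call styles being compared look roughly like this; a minimal sketch, not the code from the linked playground, with illustrative types and names:

```go
package node // sketch; save as a _test.go file in the package under test

import "testing"

type summer interface{ sum() int64 }

type myInt int64

func (i myInt) sum() int64 { return int64(i) }

var sink int64 // prevents the compiler from eliding the calls

func BenchmarkIntmethod(b *testing.B) {
	i := myInt(1)
	for n := 0; n < b.N; n++ {
		sink = i.sum() // direct (possibly inlined) method call
	}
}

func BenchmarkInterface(b *testing.B) {
	var s summer = myInt(1)
	for n := 0; n < b.N; n++ {
		sink = s.sum() // dynamic dispatch through the interface
	}
}

func BenchmarkTypeSwitch(b *testing.B) {
	var s summer = myInt(1)
	for n := 0; n < b.N; n++ {
		switch v := s.(type) {
		case myInt:
			sink = v.sum()
		}
	}
}

func BenchmarkTypeAssertion(b *testing.B) {
	var s summer = myInt(1)
	for n := 0; n < b.N; n++ {
		sink = s.(myInt).sum()
	}
}
```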

Very preliminary benchmarks using the example code from the Stack Overflow link showed:

https://go.dev/play/p/E9H_4K2J9-

// when function can be inlined, performance hit on interface is larger
BenchmarkIntmethod-4       	428540371	         2.859 ns/op	       0 B/op	       0 allocs/op
BenchmarkInterface-4       	388212188	         3.089 ns/op	       0 B/op	       0 allocs/op
BenchmarkTypeSwitch-4      	512972532	         2.336 ns/op	       0 B/op	       0 allocs/op
BenchmarkTypeAssertion-4   	495534915	         2.416 ns/op	       0 B/op	       0 allocs/op
// when function cannot be inlined, performance hit on interface is smaller
BenchmarkIntmethod-4       	385235356	         3.088 ns/op	       0 B/op	       0 allocs/op
BenchmarkInterface-4       	388532904	         3.089 ns/op	       0 B/op	       0 allocs/op
BenchmarkTypeSwitch-4      	387217621	         3.088 ns/op	       0 B/op	       0 allocs/op
BenchmarkTypeAssertion-4   	388183740	         3.088 ns/op	       0 B/op	       0 allocs/op

When benchmarks were run with count=20, the not-inlined calls averaged 3.09 ns/op for all 4 benchmarks.

name             old time/op    new time/op    delta
Intmethod-4        2.84ns ± 1%    3.09ns ± 0%   +8.99%  (p=0.000 n=20+19)
Interface-4        3.09ns ± 0%    3.09ns ± 0%   +0.07%  (p=0.045 n=18+20)
TypeSwitch-4       2.33ns ± 0%    3.09ns ± 0%  +32.52%  (p=0.000 n=20+20)
TypeAssertion-4    2.42ns ± 0%    3.09ns ± 0%  +27.44%  (p=0.000 n=19+20)

Benchmarks were run on a Haswell CPU and compiled with Go 1.16.

github-actions bot commented:

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions bot added the Stale label on Oct 15, 2024