colblk: experiment with potential performance improvements #4022

Open
1 of 7 tasks
RaduBerinde opened this issue Oct 9, 2024 · 0 comments

RaduBerinde commented Oct 9, 2024

This issue is meant to be a running list of ideas to improve performance in the columnar format.

  • Don't separate the suffix into wall time / logical time / untyped version. The hypothesis is that the current comparison code is too complex and that a simple bytes.Compare might be faster. [radu] Experimented with this and saw 10-30% regressions in CockroachDataBlockIterShort.
  • Add a version-is-not-timestamp column to data blocks and have a fast path for blocks or regions that only have timestamps. This column would be empty for all SQL table data.
  • Separate prefix and suffix in index blocks and use compressed prefix encoding. The seeking should be faster and we'd be able to use a single-level index in more cases.
  • Expose a PrevPrefix operation that uses the prefix changed bitmap to quickly jump to the previous key. The expectation is that this would reduce CPU during reverse scans. I believe MVCC GC currently scans in reverse and could benefit.
  • In PrefixBytes.Search, try to compare a single byte before calling bytes.Compare.
  • Move block properties into BlockPropertyCollector-defined columns. This can allow more compact encoding of properties (e.g., MVCC timestamps may benefit from uint delta encoding). Likely a minor benefit, but we would additionally avoid decoding O(n) varints, where n is the number of BlockPropertyCollectors.
  • Stash block decoders in the table cache to avoid the initial parsing of the structure of a block on block cache hits.
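
To make the single-byte idea for PrefixBytes.Search concrete, here is a rough sketch in Go. It is not the Pebble code: it searches a plain sorted [][]byte rather than a PrefixBytes column, and compareFirstByte and searchGE are hypothetical names invented for illustration. The only point is the shape of the fast path, which decides on the first byte when the bytes differ and calls bytes.Compare only otherwise.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// compareFirstByte (hypothetical helper) orders a and b like bytes.Compare,
// but decides on the first byte alone when the first bytes differ. It falls
// back to bytes.Compare when they match or when either side is empty.
func compareFirstByte(a, b []byte) int {
	if len(a) > 0 && len(b) > 0 && a[0] != b[0] {
		if a[0] < b[0] {
			return -1
		}
		return 1
	}
	return bytes.Compare(a, b)
}

// searchGE (hypothetical helper) returns the index of the first key in the
// sorted slice keys that is >= target, or len(keys) if there is none.
func searchGE(keys [][]byte, target []byte) int {
	return sort.Search(len(keys), func(i int) bool {
		return compareFirstByte(keys[i], target) >= 0
	})
}

func main() {
	keys := [][]byte{
		[]byte("apple"), []byte("banana"), []byte("blueberry"), []byte("cherry"),
	}
	fmt.Println(searchGE(keys, []byte("blackberry"))) // 2
	fmt.Println(searchGE(keys, []byte("d")))          // 4
}
```

In a real PrefixBytes column the keys share a common prefix, so the byte worth checking would presumably be the first one after the shared portion. Whether this wins anything would need benchmarking.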

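For the PrevPrefix item, a minimal sketch of how a prefix-changed bitmap could be used to jump backwards. Everything here is an assumption made for illustration: the bitmap type, seekSetBitLE, and prevPrefix are hypothetical names, bit i is assumed to be set when row i starts a new prefix (row 0 always does), and prevPrefix is assumed to land on the first row of the previous prefix group.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitmap is a hypothetical stand-in for the prefix-changed bitmap: bit i is
// set when row i starts a new key prefix.
type bitmap []uint64

func (b bitmap) set(i int) { b[i/64] |= 1 << uint(i%64) }

// seekSetBitLE returns the largest set bit index <= i, or -1 if there is
// none. It assumes i < 64*len(b).
func (b bitmap) seekSetBitLE(i int) int {
	if i < 0 {
		return -1
	}
	w := i / 64
	// Keep only bits 0..i%64 of the first word examined.
	word := b[w] & (^uint64(0) >> uint(63-i%64))
	for {
		if word != 0 {
			return w*64 + 63 - bits.LeadingZeros64(word)
		}
		w--
		if w < 0 {
			return -1
		}
		word = b[w]
	}
}

// prevPrefix returns the first row of the prefix group that precedes the
// group containing row, or -1 if row is already in the first group.
func prevPrefix(b bitmap, row int) int {
	start := b.seekSetBitLE(row) // first row of the current prefix group
	return b.seekSetBitLE(start - 1)
}

func main() {
	// Rows 0-2 share one prefix, rows 3-4 another, row 5 a third.
	b := make(bitmap, 1)
	b.set(0)
	b.set(3)
	b.set(5)
	fmt.Println(prevPrefix(b, 5)) // 3
	fmt.Println(prevPrefix(b, 4)) // 0
	fmt.Println(prevPrefix(b, 1)) // -1
}
```

The word-at-a-time scan is what would make this cheaper than stepping row by row: a long run of versions under one prefix is skipped with a handful of word operations rather than one key comparison per row.
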
Jira issue: PEBBLE-273

Projects
Status: Next
Development

No branches or pull requests

2 participants