Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sstable: specialized index blockIter #2592

Closed
jbowens opened this issue Jun 5, 2023 · 3 comments
Closed

sstable: specialized index blockIter #2592

jbowens opened this issue Jun 5, 2023 · 3 comments

Comments

@jbowens
Copy link
Collaborator

jbowens commented Jun 5, 2023

Implementing a specialized blockIter for index blocks should reduce the cpu cost of seeks:

  1. Index blocks are dense. They tend to contain smaller keys (because their keys may be shortened by Comparer.Separator), and relatively small values (an encoded BlockHandle). This density means seeking within an index block has a higher cpu cost, and optimizations here can have an outsized impact on overall seek cpu cost.
  2. The Trailers of keys within index blocks are always ignored. There's no need to decode them.1
  3. The values of keys in index blocks are always inlined, so we can avoid the overhead associated with checking whether a value is inlined or out-of-band and overhead from returning base.LazyValues.
  4. Related to 1, we do not need to perform obsolete bit checks or obsolete key kind transformations introduced in sstable,db: introduce a sstable-internal ObsoleteBit in the key kind #2559 for index block keys.
  5. We write index blocks with a restart interval of 1. Besides encoding the offset of every key, this forces every key to be written in whole without any prefix compression. We can avoid the overhead of considering whether a key might be prefix-compressed.2

The downside is code duplication. But there are a few factors that make this more tolerable:

  1. We use a subset of the blockIter interface with index blocks (eg, no NextPrefix, no SeekLT), so it's not quite doubling the interface surface area.
  2. The non-trivial code surrounding lazy-value handling and reading does not need to be duplicated.
  3. The code around prefix sharing does not need to be duplicated.

Once we have a specialized index block iterator, it's easier to make changes to the index block format. There are two optimizations available right off the bat:

  1. Since the key trailers are unused1, we can omit them entirely.
  2. Since prefix compression is unused2, we can omit the shared bytes varint.

Jira issue: PEBBLE-42

@jbowens
Copy link
Collaborator Author

jbowens commented Dec 25, 2023

Relates to #97

@jbowens
Copy link
Collaborator Author

jbowens commented Dec 25, 2023

We might also consider entirely different formats for index blocks, such as an adaptive radix tree. With sufficient space savings, could we remove support for two-level indexes and simplify the sstable iterator?

Somewhat relates to #2632 by achieving similar key compression within the index block.

@jbowens jbowens self-assigned this Jul 16, 2024
@jbowens
Copy link
Collaborator Author

jbowens commented Jul 16, 2024

Moving into 'Next' category because this will fall out of the columnar block format work.

jbowens added a commit to jbowens/pebble that referenced this issue Aug 16, 2024
Add writer, reader and iterator types for index blocks.

This commit considers and resolves most of cockroachdb#97 and cockroachdb#2592.

Close cockroachdb#97.
Close cockroachdb#2592.
jbowens added a commit to jbowens/pebble that referenced this issue Aug 16, 2024
Add writer, reader and iterator types for index blocks.

This commit considers and resolves most of cockroachdb#97 and cockroachdb#2592.

Close cockroachdb#97.
Close cockroachdb#2592.
jbowens added a commit to jbowens/pebble that referenced this issue Aug 22, 2024
Add writer, reader and iterator types for index blocks.

This commit considers and resolves most of cockroachdb#97 and cockroachdb#2592.

Close cockroachdb#97.
Close cockroachdb#2592.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

No branches or pull requests

1 participant