Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce encoded size of map and bump version #331

Merged
merged 2 commits into from
Aug 9, 2023

Conversation

fxamacker
Copy link
Member

@fxamacker fxamacker commented Aug 3, 2023

Updates #292
Updates onflow/flow-go#1744

Changes:

  • bump Atree format version to 1 (was 0) for maps
  • remove redundant version and flag data from root slabs (2 bytes per root slab)
  • remove redundant slab address from children headers in metadata slab since all child slabs have the same address (8 bytes per child header per metadata slab)
  • reduce encoded child slab size in children headers in metadata slab from 4 bytes to 2 bytes (2 bytes per child header per metadata slab)
  • reduce encoded number of digests from 8 bytes to 2 bytes (6 bytes per data slab)
  • reduce encoded number of map elements from 8 bytes to 2 bytes (6 bytes per data slab)
  • decode version 0 and 1
  • encode only version 1
  • add tests to decode map slabs data in version 0

⚠️ This change requires data migration and data format version was bumped from 0 to 1.


  • Targeted PR against main branch
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work
  • Code follows the standards mentioned here
  • Updated relevant documentation
  • Re-reviewed Files changed in the Github PR explorer
  • Added appropriate labels

Changes:
- bump Atree format version to 1 (was 0) for maps
- remove redundant version and flag data from root slabs
  (2 bytes per root slab)
- remove redundant slab address from children headers in metadata slab
  since all child slabs have the same address
  (8 bytes per child header per metadata slab)
- reduce encoded child slab size in children headers in metadata slab
  from 4 bytes to 2 bytes
  (2 bytes per child header per metadata slab)
- reduce encoded number of digests from 8 bytes to 2 bytes
  (6 bytes per data slab)
- reduce encoded number of map elements from 8 bytes to 2 bytes
  (6 bytes per data slab)
- decode version 0 and 1
- encode only version 1
- add tests to decode map slabs data in version 0
@fxamacker fxamacker added improvement storage breaking change breaks existing stored data (requires storage migration) labels Aug 3, 2023
@fxamacker fxamacker self-assigned this Aug 3, 2023
@codecov-commenter
Copy link

codecov-commenter commented Aug 4, 2023

Codecov Report

Merging #331 (cb5d4b0) into main (06f4676) will increase coverage by 0.25%.
Report is 4 commits behind head on main.
The diff coverage is 68.95%.

@@            Coverage Diff             @@
##             main     #331      +/-   ##
==========================================
+ Coverage   65.04%   65.30%   +0.25%     
==========================================
  Files          14       14              
  Lines        8593     8761     +168     
==========================================
+ Hits         5589     5721     +132     
- Misses       2296     2323      +27     
- Partials      708      717       +9     
Files Changed Coverage Δ
map.go 67.44% <68.84%> (+0.48%) ⬆️
map_debug.go 52.45% <100.00%> (ø)

... and 1 file with indirect coverage changes

Copy link
Member

@turbolent turbolent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

map.go Outdated
enc.Scratch[0] = 0x9b
binary.BigEndian.PutUint64(enc.Scratch[1:], uint64(len(e.elems)))
err = enc.CBOR.EncodeRawBytes(enc.Scratch[:9])
enc.Scratch[0] = 0x99
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this magic number here and below? Maybe consider adding a constant with a descriptive name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. 0x99 indicates the start of CBOR array of 2 elements.

Replaced 0x99, 0x83, and 0x59 with consts with descriptive name. Also added more comments in cb5d4b0

}
}

// newMapDataSlabFromDataV0 decodes data in version 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great comments!

// Root DataSlab Header:
//
// +-------------------------------+------------+-------------------------------+
// | slab version + flag (2 bytes) | extra data | slab version + flag (2 bytes) |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the the repeated slab version and flag what was removed in the new format?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, new format (v1) only encodes version + flag once at the head (first 2 bytes).

Copy link
Member

@ramtinms ramtinms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

@fxamacker fxamacker merged commit 68d0092 into main Aug 9, 2023
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement storage breaking change breaks existing stored data (requires storage migration)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants