[PERF] Add include metadata to MetadataReader to avoid unnecessary payload size #2750

HammadB · 2024-08-30T08:24:17Z

Description of changes

Summarize the changes made by this PR.

Improvements & Bug fixes
- This PR introduces an "include_metadata" field to the MetadataReader across the local and distributed implementations. With it, the filter step can transmit only ids, and only the final hydration step requests all data. This reduces overall payload sizes.
- I chose this pattern to mirror what the VectorReader does with the "include_embeddings" flag.
- The correct solution is to push the query down to one node instead of round-tripping.
New functionality
- None
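
As a rough illustration of the filter-then-hydrate pattern described above, here is a minimal sketch. It is not the code from this PR: `ToyMetadataReader`, `Record`, the `get_metadata` signature, and the sample rows are all hypothetical stand-ins; only the `include_metadata` flag name comes from the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence

Metadata = Dict[str, object]


@dataclass
class Record:
    id: str
    # None when the caller asked for ids only (include_metadata=False).
    metadata: Optional[Metadata] = None


class ToyMetadataReader:
    """Hypothetical in-memory stand-in, NOT Chroma's actual MetadataReader."""

    def __init__(self, rows: Dict[str, Metadata]) -> None:
        self._rows = rows

    def get_metadata(
        self,
        where: Optional[Callable[[Metadata], bool]] = None,
        ids: Optional[Sequence[str]] = None,
        include_metadata: bool = True,
    ) -> List[Record]:
        # Restrict to the requested ids, if any were given.
        if ids is None:
            candidates = list(self._rows)
        else:
            candidates = [i for i in ids if i in self._rows]
        out: List[Record] = []
        for record_id in candidates:
            row = self._rows[record_id]
            if where is not None and not where(row):
                continue
            # Only copy the (potentially large) metadata when asked to.
            payload = dict(row) if include_metadata else None
            out.append(Record(id=record_id, metadata=payload))
        return out


reader = ToyMetadataReader(
    {
        "a": {"color": "red", "blob": "x" * 1000},
        "b": {"color": "blue", "blob": "y" * 1000},
        "c": {"color": "red", "blob": "z" * 1000},
    }
)

# Filter step: only ids cross the wire.
matches = reader.get_metadata(
    where=lambda md: md["color"] == "red", include_metadata=False
)
matched_ids = [r.id for r in matches]

# Assume some later step (e.g. a vector search) narrows the candidates.
final_ids = matched_ids[:1]

# Hydration step: full metadata is requested only for the final ids.
hydrated = reader.get_metadata(ids=final_ids, include_metadata=True)
```

In this sketch the filter response carries no metadata at all, so its size depends on the number of matching ids rather than on how large each record's metadata is.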

Test plan

How are these changes tested?
I extended the tests for the metadata segment

Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository?

github-actions · 2024-08-30T08:24:32Z

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

Can you think of any use case in which the code does not behave as intended? Have they been tested?
Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
If appropriate, are there adequate property based tests?
If appropriate, are there adequate unit tests?
Should any logging, debugging, tracing information be added or removed?
Are error messages user-friendly?
Have all documentation changes needed been made?
Have all non-obvious changes been commented?

System Compatibility

Are there any potential impacts on other parts of the system or backward compatibility?
Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

[ENH] Add include metadata to MetadataReader to avoid unncessary payl…

028c6cb

…oads

sanketkedia added 2 commits August 30, 2024 10:25

Merge metadata results only hydrates ids if not include metadata

b657218

Fix test

68caf37

sanketkedia merged commit 50459a0 into main Aug 30, 2024
67 checks passed
