Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow mode casting for Stores #2249

Merged
merged 21 commits into from
Sep 27, 2024
Merged

Conversation

TomAugspurger
Copy link
Contributor

Following the discussion in #2186 (comment), this is a POC for opening "reopening" a store with a new mode.

The semantics are that reopening a store with a new mode should return a new Store that's backed by the same underlying data.

AFAICT, our current stores use / enforce mode "logically". There's nothing about the actual underlying store that's readonly when mode="r". You could imagine some cases where it's not possible to cast between modes (e.g. you have a readonly database connection). I think it's fine for those Stores to handle that.

By default, the method is abstract meaning this is a new bit of code required for stores to implement. We could also raise NotImplementedError by default.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@@ -115,6 +117,11 @@ async def empty(self) -> bool:
else:
return True

def with_mode(self, mode: ZipStoreAccessModeLiteral) -> Self: # type: ignore[override]
return type(self)(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably needs some work... would we want the new Store to share the same Lock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't straightforward to implement, because of a need to close the old ZipStore (and writing the ending records) before opening the new store.

For now I've opted to raise NotImplementedError for ZipStore.

src/zarr/abc/store.py Outdated Show resolved Hide resolved
@TomAugspurger TomAugspurger marked this pull request as ready for review September 25, 2024 18:18
@TomAugspurger TomAugspurger changed the title [WIP]: Allow mode casting for Stores Allow mode casting for Stores Sep 25, 2024
@TomAugspurger
Copy link
Contributor Author

TomAugspurger commented Sep 25, 2024

The GPU test failed. I think that's because at

if store_dict:
self._store_dict = {k: gpu.Buffer.from_buffer(store_dict[k]) for k in iter(store_dict)}
we create a new dictionary where, so that store.with_mode('r') no longer views the same dict as the first store.

@akshaysubr I suspect that those lines are there to ensure that all of the values in the dictionary are GPU buffers. What do you think about documenting that as a requirement to use the store (and maybe narrowing the type of the dict values in __init__ to a gpu.Buffer)?

Copy link
Member

@jhamman jhamman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @TomAugspurger -- a few suggestions but this is great.

cc @d-v-b who was also looking at this problem today.

src/zarr/store/memory.py Outdated Show resolved Hide resolved
src/zarr/testing/store.py Outdated Show resolved Hide resolved
src/zarr/testing/store.py Show resolved Hide resolved
@jhamman jhamman requested a review from d-v-b September 25, 2024 22:52
@jhamman jhamman added this to the 3.0.0 milestone Sep 26, 2024
@d-v-b
Copy link
Contributor

d-v-b commented Sep 26, 2024

Thanks @TomAugspurger -- a few suggestions but this is great.

cc @d-v-b who was also looking at this problem today.

second this, once those suggestions are integrated then I think this will be a great addition.

src/zarr/testing/store.py Outdated Show resolved Hide resolved
@TomAugspurger
Copy link
Contributor Author

It's tangentially related, but I wanted to confirm that people are OK with changing the GpuMemoryStore.__init__ method to not automatically copy the data from an arbitrary Buffer into a gpu.Buffer. I added a from_dict method for people who want that, but I imagine that the common case is a user who knows where the data is and doesn't want to pay the cost of walking that dictionary and potentially moving data in the __init__.

@jhamman
Copy link
Member

jhamman commented Sep 27, 2024

It's tangentially related, but I wanted to confirm that people are OK with changing the GpuMemoryStore.__init__ method to not automatically copy the data from an arbitrary Buffer into a gpu.Buffer. I added a from_dict method for people who want that, but I imagine that the common case is a user who knows where the data is and doesn't want to pay the cost of walking that dictionary and potentially moving data in the __init__.

cc @akshaysubr

Copy link
Member

@jhamman jhamman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me @TomAugspurger - some failing tests in GPU land but otherwise looks good to me.


def __str__(self) -> str:
return f"gpumemory://{id(self._store_dict)}"

def __repr__(self) -> str:
return f"GpuMemoryStore({str(self)!r})"

@classmethod
def from_dict(cls, store_dict: MutableMapping[str, Buffer]) -> Self:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this API!

src/zarr/store/logging.py Outdated Show resolved Hide resolved
src/zarr/store/logging.py Outdated Show resolved Hide resolved
src/zarr/store/logging.py Outdated Show resolved Hide resolved
@jhamman jhamman modified the milestones: 3.0.0, 3.0.0.beta Sep 27, 2024
@jhamman
Copy link
Member

jhamman commented Sep 27, 2024

pre-commit.ci autofix

@jhamman jhamman merged commit 73b884b into zarr-developers:v3 Sep 27, 2024
20 checks passed
@jhamman jhamman added the V3 Affects the v3 branch label Oct 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
V3 Affects the v3 branch
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants