Introduce ContentType as Blob vs Zarr and rename blobDateModified to contentDateModified #220

Draft: wants to merge 1 commit into master

Conversation

yarikoptic
Member

This is just an initial attempt open for discussion. @satra please chime in

I ran into "blobDateModified" in Zarr metadata and it raised my eyebrow, since that name is not really appropriate there and is confusing. Hence I decided to look into generalizing it. I also thought it would be valuable to make the "type" of the content an Asset points to explicit, although that could lead to inconsistencies: the information is somewhat redundant with encodingFormat and could potentially also be deduced from contentUrl, since we have different endpoints on S3, etc.

Nevertheless I think it might be better to make it explicit. Or at least we have to rename blobDateModified.

  • The ContentType name is quite suboptimal since there is a standard HTTP header https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type and thus we could cause confusion.

    But we should keep it a "Type" (not e.g. a Class) to be consistent with other type definitions among models.

    So the other part we could try to vary is "Content". Possible alternatives are "Object", "Data", "Resource"

  • ATM we call all Zarrs just Zarr, but it is really a "ZarrFolder". I wonder if it is time to start introducing differentiation here by making it "ZarrFolder", as later we might get "ZarrHDF5" or the like.
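
To make the trade-off concrete, here is a minimal sketch of what an explicit type plus the rename could look like on a plain metadata dict. It is not the code in this PR: `migrate_asset_metadata`, `infer_content_type`, and the `contentType` key are hypothetical names, and the `application/x-zarr` value for encodingFormat is an assumption. The check at the end illustrates the inconsistency risk that comes from the type being redundant with encodingFormat.

```python
from enum import Enum
from typing import Any, Dict, Optional

# Assumption: Zarr assets carry this encodingFormat; not confirmed by this PR.
ZARR_MIME_TYPE = "application/x-zarr"


class ContentType(str, Enum):
    """Hypothetical explicit type of the content an Asset points to."""

    Blob = "Blob"
    Zarr = "Zarr"


def infer_content_type(encoding_format: Optional[str]) -> ContentType:
    """Deduce the content type from encodingFormat alone."""
    if encoding_format == ZARR_MIME_TYPE:
        return ContentType.Zarr
    return ContentType.Blob


def migrate_asset_metadata(meta: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy that uses contentDateModified and an explicit contentType."""
    out = dict(meta)  # do not mutate the caller's record
    if "blobDateModified" in out:
        old_value = out.pop("blobDateModified")
        # Keep an existing contentDateModified if both keys are present.
        out.setdefault("contentDateModified", old_value)
    inferred = infer_content_type(out.get("encodingFormat"))
    declared = out.setdefault("contentType", inferred.value)
    # The explicit type is redundant with encodingFormat, so verify they agree.
    if ContentType(declared) is not inferred:
        raise ValueError(
            f"contentType {declared!r} disagrees with encodingFormat "
            f"{out.get('encodingFormat')!r}"
        )
    return out
```

With this sketch, a record holding `encodingFormat: application/x-zarr` and `blobDateModified` comes back with `contentType: Zarr` and `contentDateModified`, while a record that declares `contentType: Zarr` next to a non-Zarr encodingFormat is rejected.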

@yarikoptic yarikoptic requested a review from satra January 30, 2024 15:26

codecov bot commented Jan 30, 2024

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (4423b41) 97.66% compared to head (ddd77da) 97.61%.

| Files | Patch % | Lines |
|---|---|---|
| dandischema/metadata.py | 75.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #220      +/-   ##
==========================================
- Coverage   97.66%   97.61%   -0.05%     
==========================================
  Files          18       18              
  Lines        1798     1806       +8     
==========================================
+ Hits         1756     1763       +7     
- Misses         42       43       +1     
| Flag | Coverage Δ |
|---|---|
| unittests | 97.61% <90.00%> (-0.05%) ⬇️ |


@satra
Member

satra commented Feb 2, 2024

i'm fine with something like this being in the database, but not sure about changing the metadata model. perhaps we can brainstorm when we meet.
