Unable to write custom metadata via configuration since version 0.9.0 #1353

Closed
torshind opened this issue May 10, 2023 · 2 comments · Fixed by #1365
Labels
bug Something isn't working

Comments

@torshind

Environment

Delta-rs version: 0.9.0

Binding: python

Environment:

  • OS: Ubuntu 22.04.2 WSL2
  • Other: Python 3.10.6

Bug

What happened: with version 0.9.0, a custom configuration field is no longer written to the table metadata

What you expected to happen: read a custom metadata field without errors

How to reproduce it: write and read a table using configuration={"configTest": "foobar"} like in the unit tests of the previous version

@torshind torshind added the bug Something isn't working label May 10, 2023
@roeap
Collaborator

roeap commented May 14, 2023

Thanks for reporting @torshind.

Had a look and this is in fact a regression caused by us switching to a new implementation for the create command, which only honors "official" delta configuration. After reading the delta protocol a bit more, it seems that it is completely feasible to write custom data into the table config, even if readers / writers do not know what to do with it.
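To illustrate the difference, here is a simplified sketch in plain Python. It is not the delta-rs implementation: the function names are hypothetical and `KNOWN_KEYS` is only a small illustrative subset of documented Delta table properties.

```python
# Illustrative subset of documented Delta table properties (not exhaustive).
KNOWN_KEYS = {
    "delta.appendOnly",
    "delta.checkpointInterval",
    "delta.logRetentionDuration",
}


def filter_known(configuration):
    """Hypothetical strict behavior: keep only recognized keys."""
    return {k: v for k, v in configuration.items() if k in KNOWN_KEYS}


def pass_through(configuration):
    """Hypothetical lenient behavior: keep every key, known or not."""
    return dict(configuration)


requested = {"delta.appendOnly": "true", "configTest": "foobar"}

strict = filter_known(requested)    # custom key is silently dropped
lenient = pass_through(requested)   # custom key survives

print(strict)
print(lenient)
```

With the strict variant the custom key disappears without an error, which matches the symptom in the report.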

That said, a more common pattern - at least from what I have seen - is to include custom metadata with the commit information and not with the global table metadata. As it used to be, the custom configuration would also only be included if a new table is created, and not if passed in subsequent writes to the table.

I guess my questions are:

  • would you expect to be able to update custom config on the table metadata also after creating the table? (It is possible now on the Rust side, but not in Python.)
  • would adding metadata to the commit info on individual commits also serve your use-case?
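For context, the two places being compared can be sketched as transaction log actions. This is a rough illustration built with the standard library only; the id, timestamp, schema, and `userMetadata` text are made-up values, and the actions are trimmed to a few fields.

```python
import json
import uuid

# Made-up single-column schema, serialized as a string inside the action.
schema = {
    "type": "struct",
    "fields": [{"name": "id", "type": "long", "nullable": True, "metadata": {}}],
}

# Table-level metadata: the custom key lives in "configuration" and applies
# to the table as a whole until a later metaData action replaces it.
meta_data = {
    "metaData": {
        "id": str(uuid.uuid4()),
        "format": {"provider": "parquet", "options": {}},
        "schemaString": json.dumps(schema),
        "partitionColumns": [],
        "configuration": {"configTest": "foobar"},
    }
}

# Per-commit information: custom text is attached to one commit only.
commit_info = {
    "commitInfo": {
        "timestamp": 1683700000000,  # made-up epoch milliseconds
        "operation": "WRITE",
        "userMetadata": "tagged by nightly job",
    }
}

# Each action would be one JSON line in a commit file of the log.
log_lines = [json.dumps(commit_info), json.dumps(meta_data)]
for line in log_lines:
    print(line)
```

The first question concerns rewriting the `configuration` map after table creation, while the second concerns attaching data to each individual commit.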

@torshind
Author

torshind commented May 15, 2023

Hello, thanks for your answer.

  • Yes, I think updating the metadata after creating the table is a useful thing;
    one use case I can think of is a tag that depends on the current state of the table data.
  • If you're referring to the userMetadata field I found in the standard, it's less useful for us at the moment, but I can't tell yet whether we might need it later; we're still in the design phase.

wjones127 pushed a commit that referenced this issue May 17, 2023
# Description

When switching to the create operation to back `write_deltalake` we
allowed only known configuration keys on metadata. This fixes that
regression.

# Related Issue(s)

closes #1353