Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: cast support fields nested in lists and maps #2541

Merged
merged 1 commit into from
Jun 5, 2024

Conversation

HawaiianSpork
Copy link
Contributor

@HawaiianSpork HawaiianSpork commented May 27, 2024

Description

The current implementation of cast only works for structs nested in structs. This PR adds supports for structs contained in other types (lists and maps). This PR also prevents cast from adding nullable column if the field is not nullable, instead it will throw an error.

Note: This is only a partial solution which would let you merge schema with nested missing columns, it does not allow delta-rs to read the merged schema (though Spark can). To read the merged schema will require another change where delta-rs defines its own datafusion parquet schemaAdapter.

@github-actions github-actions bot added the binding/rust Issues for the Rust crate label May 27, 2024
Copy link

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@HawaiianSpork HawaiianSpork changed the title fix: Cast support fields nested in lists and maps fix: cast support fields nested in lists and maps May 27, 2024
@ion-elgreco
Copy link
Collaborator

Please add also a bunch of python tests

@rtyler
Copy link
Member

rtyler commented May 29, 2024

Please add also a bunch of python tests

@ion-elgreco I'm confused as to what a Python set of tests would do that the tests in Rust don't already do? 😕

@ion-elgreco
Copy link
Collaborator

ion-elgreco commented May 29, 2024

Please add also a bunch of python tests

@ion-elgreco I'm confused as to what a Python set of tests would do that the tests in Rust don't already do? 😕

We mostly need to check the reader side with pyarrow datasets

ion-elgreco
ion-elgreco previously approved these changes Jun 4, 2024
Copy link
Collaborator

@ion-elgreco ion-elgreco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks

@ion-elgreco
Copy link
Collaborator

@HawaiianSpork thanks for the PR! can you rebase please so we can merge

The current implementation of cast only works structs nested in structs.  This PR adds supports for structs nested in other types.  This PR also prevents cast from adding nullable column if the field is not nullable.
@HawaiianSpork
Copy link
Contributor Author

@ion-elgreco thank you. Sorry, I did not get around to writing python tests. The code has been rebased.

@ion-elgreco ion-elgreco enabled auto-merge (squash) June 5, 2024 10:08
@ion-elgreco ion-elgreco merged commit f23c92d into delta-io:main Jun 5, 2024
20 checks passed
@HawaiianSpork HawaiianSpork deleted the cast-with-more-containers branch June 21, 2024 01:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants