fix: return unsupported error for merging schemas in the presence of partition columns #2469

Merged
merged 1 commit into from
May 7, 2024



@emcake emcake commented Apr 30, 2024


Description

This causes attempts to write to a partitioned table with MergeSchema to fail, as it's not supported by the code.
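For readers skimming the thread, here is a minimal sketch of the kind of guard described above. It is not the code from this PR: `WriteMode`, `WriteError`, and `check_merge_supported` are hypothetical names made up for illustration, and the real writer in `crates/core/src/writer/record_batch.rs` may be structured differently.

```rust
use std::fmt;

/// Hypothetical stand-in for the writer's schema handling mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteMode {
    Default,
    MergeSchema,
}

/// Hypothetical error type; only the "unsupported" case is modelled here.
#[derive(Debug, PartialEq, Eq)]
pub enum WriteError {
    Unsupported(String),
}

impl fmt::Display for WriteError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            WriteError::Unsupported(msg) => write!(f, "unsupported: {msg}"),
        }
    }
}

impl std::error::Error for WriteError {}

/// Fail fast when schema merging is requested for a partitioned table,
/// instead of silently dropping the newly added columns.
pub fn check_merge_supported(
    mode: WriteMode,
    partition_columns: &[String],
) -> Result<(), WriteError> {
    if mode == WriteMode::MergeSchema && !partition_columns.is_empty() {
        return Err(WriteError::Unsupported(
            "merging schemas is not supported when the table has partition columns".to_string(),
        ));
    }
    Ok(())
}
```

A caller would run this check before partitioning the incoming batch, so the error surfaces before any data is written.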

I took a look at trying to make it work, but there isn't a quick fix. This is because we need a merged schema definition before we start trying to partition by the partition columns, otherwise the newly added columns get dropped. The schema reported for matching in self.arrow_schema_ref also needs to contain the partition columns, and ordering matters when comparing schemas, so we need to know the right place to insert them. I think for situations where flush() has already been called, we also need a function that returns an Option<MetaData> action to be applied to manual commits. Finally, re-using a writer in the presence of schema evolution is dangerous, as the original_schema_ref is never updated to match the newly changed schema.

I'd love to follow up with a fix, but in the short term I'd like to just stop others from getting bitten like I did.
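To illustrate the ordering problem, here is a small sketch under simplifying assumptions: a schema is modelled as an ordered `Vec` of hypothetical `Field` values rather than a real Arrow schema, and the helper names are invented for this example. It shows that stripping the partition columns and then naively appending them again yields the same set of fields in a different order, so an order-sensitive comparison fails.

```rust
/// Simplified, hypothetical stand-in for a schema field.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Field {
    pub name: String,
    pub data_type: String,
}

pub fn field(name: &str, data_type: &str) -> Field {
    Field {
        name: name.to_string(),
        data_type: data_type.to_string(),
    }
}

/// Drop the partition columns, leaving only the columns stored in data files.
pub fn strip_partition_columns(schema: &[Field], partition_columns: &[&str]) -> Vec<Field> {
    schema
        .iter()
        .filter(|f| !partition_columns.contains(&f.name.as_str()))
        .cloned()
        .collect()
}

/// Naive re-insertion: append the partition columns at the end.
/// This loses the original position of each partition column.
pub fn append_partition_columns(data_schema: &[Field], partition_fields: &[Field]) -> Vec<Field> {
    let mut out = data_schema.to_vec();
    out.extend_from_slice(partition_fields);
    out
}
```

With a table schema of `id, date, value` partitioned by `date`, the rebuilt schema comes out as `id, value, date`. Comparing the two as ordered lists reports a mismatch even though no field changed, which is why a real fix needs to know where each partition column belongs.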

Related Issue(s)

#2468

@github-actions github-actions bot added the binding/rust Issues for the Rust crate label Apr 30, 2024

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@rtyler rtyler changed the title fix: Return unsupported error for merging schemas in the presence of … fix: return unsupported error for merging schemas in the presence of partition columns May 1, 2024
Two outdated review threads on crates/core/src/writer/record_batch.rs (resolved)
@rtyler rtyler self-assigned this May 7, 2024
@rtyler rtyler added this to the Rust v0.18 milestone May 7, 2024
@rtyler rtyler enabled auto-merge (rebase) May 7, 2024 13:50
@rtyler rtyler force-pushed the unsupported-merge-partition-cols branch from a7c62bd to 0ec8a95 Compare May 7, 2024 13:57
@rtyler rtyler merged commit 35664c0 into delta-io:main May 7, 2024
20 checks passed