Way to check if Delta table exists at specified path #2662

Closed
jorritsandbrink opened this issue Jul 13, 2024 · 5 comments · Fixed by #2715
Labels: enhancement (New feature or request)

Comments

@jorritsandbrink

jorritsandbrink commented Jul 13, 2024

Description

It would be nice to have a way to check if a Delta table exists at a given path.

Delta Spark has the DeltaTable.isDeltaTable() method: https://docs.delta.io/latest/api/python/spark/index.html#delta.tables.DeltaTable.isDeltaTable

Use Case
When merging into a Delta table, you first define a DeltaTable object before calling merge on it:

dt = DeltaTable("path_to_table_that_may_not_exist_yet")

dt.merge(...)

This fails with a TableNotFoundError if the table does not exist.

Having the isDeltaTable method (or similar) would enable this flow:

if DeltaTable.isDeltaTable("path_to_table_that_may_not_exist_yet"):
    dt = DeltaTable("path_to_table_that_may_not_exist_yet")
    dt.merge(...)
else:
    write_deltalake(...)

The above requires two path lookups (one for the existence check and one to load the table), which adds unnecessary latency.

Alternatively, something like this could be nice:

dt = DeltaTable("path_to_table_that_may_not_exist_yet", return_none_if_not_exists=True)

if dt is not None:
    dt.merge(...)
else:
    write_deltalake(...)

jorritsandbrink added the enhancement label Jul 13, 2024

@omkar-foss
Contributor

omkar-foss commented Jul 15, 2024

Hey @jorritsandbrink, there's a function called try_get_deltatable() in deltalake.writer that provides this functionality.

try_get_deltatable() returns None if no table is found at the specified path (i.e. a TableNotFoundError occurs); otherwise it returns the DeltaTable instance for the table at that path. It fits the flow you described as follows:

from deltalake import write_deltalake
from deltalake.writer import try_get_deltatable

dt = try_get_deltatable("path_to_table_that_may_not_exist_yet")
if dt is not None:
    dt.merge(...)
else:
    write_deltalake(...)
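
The "return None instead of raising" behaviour described above can be sketched generically. This is a minimal illustration of the pattern, not the delta-rs implementation: try_load, FakeTableNotFoundError, fake_open_table and EXISTING_TABLES are all hypothetical names, and the stand-ins exist only so the sketch runs without deltalake installed.

```python
from typing import Callable, Optional, Type, TypeVar

T = TypeVar("T")


def try_load(
    loader: Callable[[str], T],
    path: str,
    not_found: Type[Exception],
) -> Optional[T]:
    """Call loader(path) exactly once; return None if it raises `not_found`.

    Any other exception propagates, so unrelated failures are not hidden.
    """
    try:
        return loader(path)
    except not_found:
        return None


# Hypothetical stand-ins (assumptions for illustration, not deltalake APIs).
class FakeTableNotFoundError(Exception):
    pass


EXISTING_TABLES = {"tables/events"}


def fake_open_table(path: str) -> str:
    """Pretend to open a table; raise if the path is not a known table."""
    if path not in EXISTING_TABLES:
        raise FakeTableNotFoundError(path)
    return f"table at {path}"


found = try_load(fake_open_table, "tables/events", FakeTableNotFoundError)
missing = try_load(fake_open_table, "tables/missing", FakeTableNotFoundError)
```

With deltalake installed, the same shape would presumably be try_load(DeltaTable, path, TableNotFoundError), which keeps the flow to a single path lookup.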

And yes, I agree with you: since this is a very common use case, we can consider abstracting it into something more meaningful like DeltaTable.isDeltaTable().

Until then, I suppose we should also consider adding the try_get_deltatable() function to the delta-rs documentation, as it may help other users looking for this functionality.

@jorritsandbrink
Author

@omkar-foss thanks! I'll use try_get_deltatable().

@ion-elgreco
Collaborator

Feel free to open a PR for a static method is_deltatable()

@omkar-foss
Contributor

take

omkar-foss added a commit to omkar-foss/delta-rs that referenced this issue Jul 29, 2024
This commit adds a static method `is_deltatable(path, opts)` to the
`DeltaTable` class, which returns `True` if delta-rs is able to load
a `DeltaTable` instance from the specified `path` and `False` otherwise.
omkar-foss added a commit to omkar-foss/delta-rs that referenced this issue Jul 29, 2024
This adds a static method `is_deltatable(path, opts)` to the
`DeltaTable` class, which returns `True` if delta-rs is able to load
a `DeltaTable` instance from the specified `path` and `False` otherwise.

Additionally, this also adds documentation of the usage with
examples for the `DeltaTable.is_deltatable()` method.
@omkar-foss
Contributor

> Feel free to open a PR for a static method is_deltatable()

Hey @ion-elgreco @jorritsandbrink, hope you're both doing great. I've opened PR #2715; please have a look whenever you get a chance. Thanks!

omkar-foss added a commit to omkar-foss/delta-rs that referenced this issue Jul 30, 2024
This adds a static method `is_deltatable(path, opts)` to the
`DeltaTable` Python class, which returns `True` if able to
locate a delta table at specified `path` and `False` otherwise.

It does so by reusing the Rust internal `is_delta_table_location()`
via the `DeltaTableBuilder`.

Additionally, this also adds documentation of the usage with
examples for the `DeltaTable.is_deltatable()` method.
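
To make "locate a delta table at specified path" a little more concrete: a Delta table keeps its transaction log in a `_delta_log` directory, with commit files named as a zero-padded 20-digit version followed by `.json`. The sketch below checks for that layout on the local filesystem only. It is a rough approximation for illustration, not how `is_delta_table_location()` is implemented, and looks_like_local_delta_table is a hypothetical helper name.

```python
import re
from pathlib import Path

# Commit files in the Delta log: 20-digit zero-padded version + ".json".
COMMIT_FILE = re.compile(r"^\d{20}\.json$")


def looks_like_local_delta_table(path: str) -> bool:
    """Return True if `path` contains a `_delta_log` dir with a commit file.

    Assumption: local filesystem only. Object stores, storage options and
    log validity are not handled here.
    """
    log_dir = Path(path) / "_delta_log"
    if not log_dir.is_dir():
        return False
    return any(COMMIT_FILE.match(entry.name) for entry in log_dir.iterdir())
```

A boolean check like this still leaves a second lookup to actually load the table, which is the latency trade-off raised in the issue description.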
github-merge-queue bot pushed a commit that referenced this issue Jul 31, 2024 (same commit message as above)