[FlightRPC] Cannot use flight data with DataFusion (Rust) #43552
EnricoMi changed the title from "[FlightRPC] Cannot use flight data with datafusion (Rust)" to "[FlightRPC] Cannot use flight data with DataFusion (Rust)" on Aug 5, 2024
This can be fixed in Rust Arrow by copying the data: apache/arrow-rs#6462 and apache/arrow-rs#6471. Ideally, the data would not be misaligned in the first place, so the memory could be reused without costly copying.
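To illustrate what "copying the data" means here, the following is a minimal sketch using only the Rust standard library. It is not the code from the linked arrow-rs pull requests; the function name `copy_to_aligned_i32` is hypothetical, and little-endian `i32` values are assumed. Decoding byte by byte never reads through a misaligned `i32` pointer, and the resulting `Vec<i32>` is allocated with the alignment that `i32` needs.

```rust
/// Hypothetical helper: decode little-endian i32 values from a byte slice
/// that may start at any address, copying them into a new Vec<i32>.
/// Trailing bytes that do not form a full 4-byte value are ignored.
fn copy_to_aligned_i32(bytes: &[u8]) -> Vec<i32> {
    bytes
        .chunks_exact(4)
        .map(|c| i32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

fn main() {
    // One padding byte followed by the values 1 and 2 as little-endian i32.
    let raw = [0u8, 1, 0, 0, 0, 2, 0, 0, 0];

    // Skipping the first byte gives a slice that starts one byte past `raw`.
    let values = copy_to_aligned_i32(&raw[1..]);
    println!("{:?}", values);
}
```

The cost is one extra allocation and one pass over the data per buffer, which is the copying that an aligned sender would make unnecessary.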
Describe the bug, including details regarding any error messages, version, and platform.
Fetching data via Apache Arrow Flight (C++ and Python involved) and passing it to Apache DataFusion (Rust) does not work.
This is likely due to #32276 / #36400.
Error:
Reproduce as follows:
git clone --depth=1 https://github.com/apache/arrow.git
git clone --depth=1 https://github.com/apache/arrow-testing.git
python -m venv venv
source venv/bin/activate
pip install pyarrow pandas datafusion
python arrow/python/examples/flight/server.py
RUST_BACKTRACE=1 python example.py arrow-testing/data/csv/aggregate_test_100.csv
with example.py:
The error is thrown in the Apache Arrow Rust implementation: https://github.com/apache/arrow-rs/blob/eddef43d1cb46c1287da187ea1d86b0e1dc35a13/arrow-buffer/src/buffer/scalar.rs#L138
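For readers unfamiliar with pointer alignment in Rust, here is a small standard-library sketch of the kind of check involved. It is not the arrow-rs source; `Aligned8` and `is_aligned` are hypothetical names used only for illustration. It relies on `align_offset`, which reports how many bytes a pointer must advance to reach the next address that is a multiple of the given alignment, so a result of 0 means the pointer is already aligned.

```rust
/// Hypothetical storage type: 16 bytes whose start address is 8-byte aligned.
#[repr(align(8))]
struct Aligned8([u8; 16]);

/// Hypothetical helper: true if `ptr` is a multiple of `align`.
/// `align` must be a power of two, as `align_offset` requires.
fn is_aligned(ptr: *const u8, align: usize) -> bool {
    ptr.align_offset(align) == 0
}

fn main() {
    let storage = Aligned8([0u8; 16]);
    let base = storage.0.as_ptr();

    // The start of the storage is aligned for both 4-byte and 8-byte types.
    println!("base:    4 -> {}, 8 -> {}", is_aligned(base, 4), is_aligned(base, 8));

    // A pointer one byte further on is aligned for neither.
    let shifted = base.wrapping_add(1);
    println!("shifted: 4 -> {}, 8 -> {}", is_aligned(shifted, 4), is_aligned(shifted, 8));

    // Number of bytes each pointer would have to advance to become aligned.
    println!("offsets: {} {}", shifted.align_offset(4), shifted.align_offset(8));
}
```

A buffer that starts one byte past an 8-byte boundary fails this check for both `i32` and `i64`, so an implementation that insists on an offset of 0 has to either reject such a buffer or copy it.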
In my environment, depending on the CSV column, e.g. c1 (c2) with type i32 (i64), align is 4 (8) while buffer.as_ptr().align_offset(align) is always 3 (7), where arrow-rs requires this to be 0.
Component(s)
FlightRPC