ARROW-104: [FORMAT] Add alignment and padding requirements + union clarification #67

Closed
wants to merge 2 commits into from

emkornfield
Contributor

I believe this change captures the discussion we had on the mailing list about alignment and padding for arrays. It also captures the update to UnionArrays. The rendered version should be viewable here: https:/emkornfield/arrow/blob/emk_format_changes/format/Layout.md

@emkornfield changed the title from "ARROW-104: Add alignment and padding requirements + union clarification" to "ARROW-104: [FORMAT] Add alignment and padding requirements + union clarification" on Apr 23, 2016
@@ -10,6 +10,8 @@ concepts, here is a small glossary to help disambiguate.
* Contiguous memory region: a sequential virtual address space with a given
length. Any byte can be reached via a single pointer offset less than the
region's length.
* Contiguous memory buffer: A contiguous memory region that stores
a multi-value component of an Array. Sometimes referred to as just "buffer".
Contributor Author

Would this read better as "a variable-length component of an Array"?

Member

I think either is okay. We could give as an example an array of integers of some type (e.g. signed int8 or signed int32) and a given length.

Contributor Author

I'll leave as is. Hopefully the rest of the document serves as an example.
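
For readers of this thread, here is a rough sketch of the kind of example suggested above: a fixed-length array of signed int32 values stored in one contiguous buffer, where each element is reached through a single byte offset smaller than the buffer's length. It is not part of the patch. The helper names `int32_buffer` and `read_int32` are hypothetical, and the little-endian byte order is an assumption made only for this illustration.

```python
import struct

INT32_WIDTH = 4  # bytes per signed 32-bit integer


def int32_buffer(values):
    """Pack signed int32 values into one contiguous buffer (little-endian assumed)."""
    return bytearray(struct.pack(f"<{len(values)}i", *values))


def read_int32(buffer, index):
    """Read element `index` using a single byte offset into the buffer."""
    offset = index * INT32_WIDTH
    # The whole element must lie inside the region: 0 <= offset < len(buffer).
    if index < 0 or offset + INT32_WIDTH > len(buffer):
        raise IndexError("element lies outside the buffer")
    return struct.unpack_from("<i", buffer, offset)[0]


buf = int32_buffer([1, -2, 3, 4])
print(len(buf))            # 16 bytes: 4 values * 4 bytes each
print(read_int32(buf, 1))  # -2
```

The sketch covers only the "contiguous" part of the definition; it says nothing about the alignment or padding rules this pull request adds.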

@wesm
Member

wesm commented Apr 23, 2016

This all seems reasonable to me. @toddlipcon, @jacques-n, @Ippokratis7, @bigdata-memory, and others, would you give a look when you are able?

@emkornfield
Contributor Author

Any additional feedback on this?

@wesm
Member

wesm commented Apr 30, 2016

Since this reflects the discussion on the mailing list, I'm signing off on this as the default alignment, and we can revisit if there are some lingering concerns. Should other minimum alignments be needed, we can most likely address that in the metadata. +1, thank you

@asfgit asfgit closed this in 56514d9 Apr 30, 2016
praveenbingo pushed a commit to praveenbingo/arrow that referenced this pull request Aug 30, 2018
Support isnull, isnotnull, equal, and not_equal for date/time types
Support date/time types for less_than, less_than_or_equal_to, greater_than, greater_than_or_equal_to
Implement all extractXxx functions
wesm added a commit to wesm/arrow that referenced this pull request Sep 2, 2018
I was also able to remove the `-Wno-unused-value` compiler flag. Removing `-Wno-unused-variable` will have to take place in another patch (more work required).

Author: Wes McKinney <[email protected]>

Closes apache#67 from wesm/PARQUET-463 and squashes the following commits:

da3afb2 [Wes McKinney] Fix signed-unsigned comparisons inside dchecks
a1ca479 [Wes McKinney] Remove -Wno-unused-value
0b49cc6 [Wes McKinney] Adapt simple dcheck macros from Kudu, fix dcheck failures
praveenbingo pushed a commit to praveenbingo/arrow that referenced this pull request Sep 4, 2018
Support isnull, isnotnull, equal, and not_equal for date/time types
Support date/time types for less_than, less_than_or_equal_to, greater_than, greater_than_or_equal_to
Implement all extractXxx functions
wesm added a commit to wesm/arrow that referenced this pull request Sep 4, 2018
I was also able to remove the `-Wno-unused-value` compiler flag. Removing `-Wno-unused-variable` will have to take place in another patch (more work required).

Author: Wes McKinney <[email protected]>

Closes apache#67 from wesm/PARQUET-463 and squashes the following commits:

da3afb2 [Wes McKinney] Fix signed-unsigned comparisons inside dchecks
a1ca479 [Wes McKinney] Remove -Wno-unused-value
0b49cc6 [Wes McKinney] Adapt simple dcheck macros from Kudu, fix dcheck failures

Change-Id: Ia735bfc97f1641984f9925f662c828ab270f0596
wesm added a commit to wesm/arrow that referenced this pull request Sep 6, 2018
I was also able to remove the `-Wno-unused-value` compiler flag. Removing `-Wno-unused-variable` will have to take place in another patch (more work required).

Author: Wes McKinney <[email protected]>

Closes apache#67 from wesm/PARQUET-463 and squashes the following commits:

da3afb2 [Wes McKinney] Fix signed-unsigned comparisons inside dchecks
a1ca479 [Wes McKinney] Remove -Wno-unused-value
0b49cc6 [Wes McKinney] Adapt simple dcheck macros from Kudu, fix dcheck failures

Change-Id: Ia735bfc97f1641984f9925f662c828ab270f0596
wesm added a commit to wesm/arrow that referenced this pull request Sep 7, 2018
I was also able to remove the `-Wno-unused-value` compiler flag. Removing `-Wno-unused-variable` will have to take place in another patch (more work required).

Author: Wes McKinney <[email protected]>

Closes apache#67 from wesm/PARQUET-463 and squashes the following commits:

da3afb2 [Wes McKinney] Fix signed-unsigned comparisons inside dchecks
a1ca479 [Wes McKinney] Remove -Wno-unused-value
0b49cc6 [Wes McKinney] Adapt simple dcheck macros from Kudu, fix dcheck failures

Change-Id: Ia735bfc97f1641984f9925f662c828ab270f0596
wesm added a commit to wesm/arrow that referenced this pull request Sep 8, 2018
I was also able to remove the `-Wno-unused-value` compiler flag. Removing `-Wno-unused-variable` will have to take place in another patch (more work required).

Author: Wes McKinney <[email protected]>

Closes apache#67 from wesm/PARQUET-463 and squashes the following commits:

da3afb2 [Wes McKinney] Fix signed-unsigned comparisons inside dchecks
a1ca479 [Wes McKinney] Remove -Wno-unused-value
0b49cc6 [Wes McKinney] Adapt simple dcheck macros from Kudu, fix dcheck failures

Change-Id: Ia735bfc97f1641984f9925f662c828ab270f0596
praveenbingo pushed a commit to praveenbingo/arrow that referenced this pull request Sep 10, 2018
Support isnull, isnotnull, equal, and not_equal for date/time types
Support date/time types for less_than, less_than_or_equal_to, greater_than, greater_than_or_equal_to
Implement all extractXxx functions
@emkornfield emkornfield deleted the emk_format_changes branch February 26, 2021 05:14
zhouyuan added a commit to zhouyuan/arrow that referenced this pull request Jan 6, 2022
* Support casting boolean to bigint (apache#60)

* remove log4j as it's not used (apache#61)

Signed-off-by: Yuan Zhou <[email protected]>

* Add stripe iteration support for batch_size reading in the ORC Scanner (apache#63)

* Install re2 headers (apache#66)

Co-authored-by: PHILO-HE <[email protected]>
Co-authored-by: zhixingheyi-tian <[email protected]>
2 participants