-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-39798: [C++] Optimize Take for fixed-size types including nested fixed-size lists #41297
Conversation
eebb793
to
54f295e
Compare
54f295e
to
eee89df
Compare
c29cc3e
to
5831f32
Compare
…nulls to a FixedSizeListBuilder
…locatePrimitiveArrayData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this! My greatest concern is the number of header utilities introduced for handling fixed-width-like. It seems unnecessary to ensure that so many of these are inlined when they don't appear inside a hot loop.
5831f32
to
b4f1b64
Compare
This push was just a rebase. Fixes coming next. |
@bkietz linker errors fixed. |
CI failures are unrelated. I'm going to merge now, but more feedback can be posted here or on the follow-up PRs that use and extend this work. |
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 9749d7d. There was 1 benchmark result indicating a performance regression:
The full Conbench report has more details. It also includes information about 41 possible false positives for unstable benchmarks that are known to sometimes produce them. |
@ursabot please benchmark lang=R |
Benchmark runs are scheduled for commit 0034a04. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
Thanks for your patience. Conbench analyzed the 6 benchmarking runs that have been run so far on PR commit 0034a04. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the belated review here.
/// \brief Checks if the given array has a fixed-width type or if it's an array of | ||
/// fixed-size list that can be flattened to an array of fixed-width values. | ||
/// | ||
/// Fixed-width types are the ones defined by the is_fixed_width() predicate in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, it's nice for docstrings to be precise and explanatory, but a 150-lines long docstring ensures that most readers won't read it in entirety. Can you try to distill it down a bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that what \brief
is for? It's rather long because of the examples.
Without them, it's really hard to understand the intricacies of this concept.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\brief
is for a one-liner introductory sentence, but if every non-trivial API function gets a 150-line docstring, then nobody will read them anymore (and, besides, even the source code will become impossible to navigate).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove the comments from the ///
and spread the explanation, but I think this concept is complicated enough that having a long comment in the source code is worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this in #41597.
inline bool IsFixedWidthLike(const ArraySpan& source, bool force_null_count, | ||
ExtraPred extra_predicate) { | ||
const auto* type = source.type; | ||
// BOOL is considered fixed-width if not nested under FIXED_SIZE_LIST. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a rather weird API design choice. Won't it confuse callers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started without BOOL
then almost every IsFixedWidthLike
check in the Take
kernel had to have || type->id() == BOOL
.
Additionally, excluding BOOL
would put this function in conflict with is_fixed_width()
. I wanted to preserve the property that the set of arrays that passes this predicate is a superset of the arrays that can pass is_fixed_width(type->id())
. How could an array be [Is]FixedWidthLike
if it's not even [is_]fixed_width()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My "weird API design choice" comment was about the "if not nested under FSL" part :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API change in #41597.
/// is_fixed_width(*mutable_array->type) SHOULD be true | ||
/// \return The pointer to the fixed-width values of an array or NULLPTR | ||
/// if pre-conditions are not satisfied. | ||
ARROW_EXPORT const uint8_t* OffsetPointerOfFixedWidthValues(const ArraySpan& source); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most if not all API functions operating on arrays take the offset into account, so adding "Offset" to the function name does not bring any additional information. FixedWidthValuesPointer
would be shorter and more consistent with MutableFixedWidthValuesPointer
below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inconsistency with MutableFixedWidthValuesPointer
was on purpose because that one requires the input to have offset==0
.
Most if not all API functions operating on arrays take the offset into account...
And the pointers they return can easily have the offset applied to them after the fact. I gave the Offset-
prefix to this one so it works as a reminder to the caller that ptr + arr.offset
won't work (it might by accident), but it's not correct for the general case. The actual offset is a complicated sum of offset products.
@pitrou replied to your comments, please resolve the ones that you think are acceptable and discuss further the ones you still think need to be addressed. Many questions revolve around the handling of |
…t bit-sized types (BOOL) (#41597) ### Rationale for this change Post-merge feedback from #41297. ### What changes are included in this PR? - Supporting `BOOL` as both a top-level and nested in FSL types - Removing the long example from the docstring of `IsFixedWidthLike` These changes don't affect users because this header was added recently and not released. ### Are these changes tested? Yes, by existing and new test cases. * GitHub Issue: #41596 Authored-by: Felipe Oliveira Carvalho <[email protected]> Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
…sted fixed-size lists (apache#41297) ### Rationale for this change Introduce utilities for dealing with fixed-width types (including fixed-size lists of fixed-width types) generically. And use it for initial optimizations of `Take` and `Filter`. ### What changes are included in this PR? - [x] Introduce utilities for dealing with fixed-width types generically - [x] Use faster `Take` kernel on small power-of-2 byte widths of fixed-width types - [x] from `FSLTakeExec` (including FSLs of FSBs) - [x] from `FSBTakeExec` (done before this PR) - [x] ~Take on any fixed-width type~ (as a separate issue apache#41301) - [x] Use faster `Filter` kernel on both primitive and fixed-width types of any length - [x] from `FSLFilterExec` (including FSLs of FSBs) - [x] from `FSBFilterExec` (done before this PR) ### Are these changes tested? By existing and new tests. ### Are there any user-facing changes? Some functions added to the `arrow::util` namespace and documented inline. * GitHub Issue: apache#39798 Authored-by: Felipe Oliveira Carvalho <[email protected]> Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
…support bit-sized types (BOOL) (apache#41597) ### Rationale for this change Post-merge feedback from apache#41297. ### What changes are included in this PR? - Supporting `BOOL` as both a top-level and nested in FSL types - Removing the long example from the docstring of `IsFixedWidthLike` These changes don't affect users because this header was added recently and not released. ### Are these changes tested? Yes, by existing and new test cases. * GitHub Issue: apache#41596 Authored-by: Felipe Oliveira Carvalho <[email protected]> Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
…support bit-sized types (BOOL) (apache#41597) ### Rationale for this change Post-merge feedback from apache#41297. ### What changes are included in this PR? - Supporting `BOOL` as both a top-level and nested in FSL types - Removing the long example from the docstring of `IsFixedWidthLike` These changes don't affect users because this header was added recently and not released. ### Are these changes tested? Yes, by existing and new test cases. * GitHub Issue: apache#41596 Authored-by: Felipe Oliveira Carvalho <[email protected]> Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
Rationale for this change
Introduce utilities for dealing with fixed-width types (including fixed-size lists of fixed-width types) generically. And use it for initial optimizations of
Take
andFilter
.What changes are included in this PR?
Take
kernel on small power-of-2 byte widths of fixed-width typesFSLTakeExec
(including FSLs of FSBs)FSBTakeExec
(done before this PR)Take on any fixed-width type(as a separate issue [C++] Extract the kernel loops used for PrimitiveTakeExec and generalize to any fixed-width type #41301)Filter
kernel on both primitive and fixed-width types of any lengthFSLFilterExec
(including FSLs of FSBs)FSBFilterExec
(done before this PR)Are these changes tested?
By existing and new tests.
Are there any user-facing changes?
Some functions added to the
arrow::util
namespace and documented inline.