[libc++] Speed up set_intersection() by fast-forwarding over ranges of non-matching elements with one-sided binary search. #75230

Merged 62 commits on Jul 18, 2024

Commits on Oct 17, 2023

  1. [libc++][test] Add lower_bound complexity validation tests prior to introducing one-sided binary search for non-random iterators.
    ichaer committed Oct 17, 2023
    b65415f
  2. [libc++] Introduce one-sided binary search for lower_bound on non-random iterators.
    
    One-sided binary search, also known as meta binary search, has been in the public domain for decades. Its general
    advantage is being Ω(1) rather than the classic algorithm's Ω(log(n)); the downside is that it executes at most
    2*log(n) comparisons versus the classic algorithm's exact log(n). There are two scenarios in which it really shines.
    The first is when operating over non-random-access iterators, because the classic algorithm requires knowing the
    container's size upfront, which adds Ω(n) iterator increments to the complexity. The second is when you are
    traversing the container in order, trying to fast-forward to the next value: in that case, the classic algorithm
    would yield Ω(n*log(n)) comparisons and, for non-random-access iterators, Ω(n^2) iterator increments, whereas the
    one-sided version will yield O(n) operations on both counts, with an Ω(log(n)) bound on the number of comparisons.
    ichaer committed Oct 17, 2023
    f6bcf27
  3. [libc++][test] Add set_intersection complexity validation tests prior to introducing use of one-sided binary search to fast-forward over ranges of elements.
    ichaer committed Oct 17, 2023
    36bb63e
  4. [libc++] Introduce use of __lower_bound_onesided to improve average complexity of set_intersection.
    ichaer committed Oct 17, 2023
    c23272c
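
The one-sided binary search described in commit f6bcf27 can be sketched roughly as follows. This is an illustration only, not the code from the PR: `lower_bound_onesided` is a hypothetical free function, and the real `__lower_bound_onesided` in libc++ may differ in signature and details. The sketch tests the current element first, then gallops forward with a doubling step, and finally bisects the last span it jumped over.

```cpp
#include <cassert>
#include <functional>
#include <iterator>

// Hypothetical illustration; not libc++'s __lower_bound_onesided.
template <class ForwardIt, class T, class Compare = std::less<>>
ForwardIt lower_bound_onesided(ForwardIt first, ForwardIt last, const T& value, Compare comp = Compare()) {
  using Diff = typename std::iterator_traits<ForwardIt>::difference_type;

  // Step 0: if the current element already qualifies, one comparison is enough.
  if (first == last || !comp(*first, value))
    return first;

  // Invariant from here on: *first is less than value.
  for (Diff step = 1;; step *= 2) {
    // Move a probe up to `step` elements ahead, stopping at `last`.
    ForwardIt probe = first;
    Diff dist = 0;
    while (dist < step && probe != last) {
      ++probe;
      ++dist;
    }
    assert(dist >= 1); // first != last, so the probe always moves

    if (probe == last || !comp(*probe, value)) {
      // The answer lies in (first, probe]. Bisect the dist - 1 unchecked
      // elements in between; if all are less than value, the answer is probe.
      ForwardIt lo = std::next(first);
      Diff len = dist - 1;
      while (len > 0) {
        Diff half = len / 2;
        ForwardIt mid = std::next(lo, half);
        if (comp(*mid, value)) {
          lo = std::next(mid);
          len -= half + 1;
        } else {
          len = half;
        }
      }
      return lo;
    }
    first = probe; // still less than value: gallop further with a doubled step
  }
}
```

Because the search never needs the total length of the range, it should work unchanged on forward-only containers such as `std::forward_list`, and the number of comparisons depends on how far the result is from `first` rather than on the size of the whole range.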

Commits on Jan 2, 2024

  1. Fix constexpr annotations.

    ichaer committed Jan 2, 2024
    0b57ea0
  2. 08af548
  3. Review feedback: don't use one-sided lower bound in lower_bound() itself since that violates the complexity guarantees from the standard.
    ichaer committed Jan 2, 2024
    7aa3927

Commits on Jan 5, 2024

  1. c44c2a2
  2. Formatting fixups.

    ichaer committed Jan 5, 2024
    46cc95f

Commits on Jan 15, 2024

  1. General improvements to benchmark, including simplifying and slimming it down for faster runs, and including comparison counter.
    ichaer committed Jan 15, 2024
    450f5ce
  2. d0c5f2b

Commits on Jan 23, 2024

  1. Oops, bad mistake while porting into libc++! `__lower_bound_onesided()` must start with `__step==0`, otherwise we can't match the complexity of linear search when continually matching (like a std::set_intersection() of matching containers will).
    ichaer committed Jan 23, 2024
    faa3115
  2. 995d04b
  3. Add more counters to the set_intersection benchmark, guard them behind an environment variable so we can choose to either measure time more accurately or obtain more information.

    This led me down an interesting road of validating benchmark results and finding a significant discrepancy in timings between running all test cases at once and running them individually with `--benchmark-filter`.
    ichaer committed Jan 23, 2024
    d568d49
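
To illustrate why commit faa3115 insists on starting at step 0, here is a minimal sketch of an intersection loop that fast-forwards with a galloping search and counts comparisons. Everything in it is hypothetical (`CountingLess`, `gallop`, `intersect_galloping`) and simplified to `std::vector<int>`; it is not the libc++ implementation. When both inputs keep matching, each call to `gallop` returns after a single comparison of the current element, so the total stays linear.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical comparator that counts how often it is invoked.
struct CountingLess {
  std::size_t* count;
  bool operator()(int lhs, int rhs) const {
    ++*count;
    return lhs < rhs;
  }
};

// Index of the first element at or after `pos` that is not less than `value`.
template <class Compare>
std::size_t gallop(const std::vector<int>& v, std::size_t pos, int value, Compare comp) {
  assert(pos <= v.size());
  // Step 0: test the current element before jumping anywhere.
  if (pos == v.size() || !comp(v[pos], value))
    return pos;
  // Invariant: v[pos] is less than value.
  for (std::size_t step = 1;; step *= 2) {
    std::size_t probe = std::min(pos + step, v.size());
    if (probe == v.size() || !comp(v[probe], value)) {
      // Bisect the unchecked elements strictly between pos and probe.
      auto it = std::lower_bound(v.begin() + pos + 1, v.begin() + probe, value, comp);
      return static_cast<std::size_t>(it - v.begin());
    }
    pos = probe;
  }
}

// Intersection of two sorted vectors, skipping non-matching runs with gallop().
template <class Compare>
std::vector<int> intersect_galloping(const std::vector<int>& a, const std::vector<int>& b, Compare comp) {
  std::vector<int> out;
  std::size_t i = 0, j = 0;
  while (i < a.size() && j < b.size()) {
    i = gallop(a, i, b[j], comp); // now a[i] >= b[j], or a is exhausted
    if (i == a.size())
      break;
    std::size_t old_j = j;
    j = gallop(b, j, a[i], comp); // now b[j] >= a[i], or b is exhausted
    if (j == b.size())
      break;
    if (j == old_j) { // b[j] did not move, so a[i] >= b[j] and b[j] >= a[i]
      out.push_back(a[i]);
      ++i;
      ++j;
    }
    // Otherwise equality is still unknown; the next iteration re-checks it.
  }
  return out;
}
```

For two identical inputs of n elements this sketch performs 2n comparisons, which is within the bound of 2 * (N1 + N2) - 1 comparisons that the standard allows for `std::set_intersection`. For inputs with long non-matching runs, the galloping step skips most of the comparisons a linear merge would make.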

Commits on Jan 29, 2024

  1. Merge branch 'main' into onesided_lower_bound

    * main: (5908 commits)
      [readtapi] Cleanup printing command line options (llvm#75106)
      [flang] Fix compilation error due to variable no being used (llvm#75210)
      [C API] Add getters and setters for fast-math flags on relevant instructions (llvm#75123)
      [libc++][CI] Tests the no RTTI configuration. (llvm#65518)
      [RemoveDIs] Fold variable into assert, it's only used once. NFC
      [RemoveDI] Handle DPValues in SROA (llvm#74089)
      [AArch64][GlobalISel] Test Pre-Commit for Look into array's element
      [mlir][tensor] Fix bug in `tensor.extract(tensor.from_elements)` folder (llvm#75109)
      [analyzer] Move alpha checker EnumCastOutOfRange to optin (llvm#67157)
      [RemoveDIs] Handle DPValues in replaceDbgDeclare (llvm#73507)
      [SHT_LLVM_BB_ADDR_MAP] Implements PGOAnalysisMap in Object and ObjectYAML with tests.
      [X86][GlobalISel] Add instruction selection for G_SELECT (llvm#70753)
      [AMDGPU] Remove unused function splitScalar64BitAddSub
      [LLVM][DWARF] Add compilation directory and dwo name to TU in dwo section (llvm#74909)
      [clang][Interp] Implement __builtin_ffs (llvm#72988)
      [RemoveDIs] Update ConvertDebugDeclareToDebugValue after llvm#72276 (llvm#73508)
      [libc][NFC] Reuse `FloatProperties` constant instead of creating new ones (llvm#75187)
      [RemoveDIs] Fix removeRedundantDdbgInstrs utils for dbg.declares (llvm#74102)
      Reapply "[RemoveDIs][NFC] Find DPValues using findDbgDeclares  (llvm#73500)"
      [GitHub] Remove author association print from new-prs workflow
      ...
    ichaer committed Jan 29, 2024
    6ba7061
  2. 76c33ca

Commits on Feb 1, 2024

  1. Revert "Oops, bad tracking of displacement on `stride_counting_iterator`"
    
    This reverts commit 995d04b.
    ichaer committed Feb 1, 2024
    bb872e0
  2. * Fix C++03 compatibility issues.

    * Fix tests I had broken.
    * More tweaks and better comments.
    ichaer committed Feb 1, 2024
    a1cd8ff
  3. 24d1d5b
  4. Merge remote-tracking branch 'llvm/main' into onesided_lower_bound

    * llvm/main: (500 commits)
      [docs] Add beginner-focused office hours (llvm#80308)
      [mlir][sparse] external entry method wrapper for sparse tensors (llvm#80326)
      [StackSlotColoring] Ignore non-spill objects in RemoveDeadStores. (llvm#80242)
      [libc][stdbit] fix return types (llvm#80337)
      Revert "[RISCV] Refine cost on Min/Max reduction" (llvm#80340)
      [TTI]Add support for strided loads/stores.
      [analyzer][HTMLRewriter] Cache partial rewrite results. (llvm#80220)
      [flang][openacc][openmp] Use #0 from hlfir.declare value when generating bound ops (llvm#80317)
      [AArch64][PAC] Expand blend(reg, imm) operation in aarch64-pauth pass (llvm#74729)
      [SHT_LLVM_BB_ADDR_MAP][llvm-readobj] Implements llvm-readobj handling for PGOAnalysisMap. (llvm#79520)
      [libc] add bazel support for most of unistd (llvm#80078)
      [clang-tidy] Remove enforcement of rule C.48 from cppcoreguidelines-prefer-member-init (llvm#80330)
      [OpenMP] Fix typo (NFC) (llvm#80332)
      [BOLT] Enable re-writing of Linux kernel binary (llvm#80228)
      [BOLT] Adjust section sizes based on file offsets (llvm#80226)
      [libc] fix stdbit include test when not all entrypoints are available (llvm#80323)
      [RISCV][GISel] RegBank select and instruction select for vector G_ADD, G_SUB (llvm#74114)
      [RISCV] Add srmcfg CSR from Ssqosid extension. (llvm#79914)
      [mlir][sparse] add sparsification options to pretty print and debug s… (llvm#80205)
      [RISCV][MC] MC layer support for the experimental zalasr extension (llvm#79911)
      ...
    ichaer committed Feb 1, 2024
    f17fa58

Commits on Feb 2, 2024

  1. Oops, missed an #include

    ichaer committed Feb 2, 2024
    4b73773

Commits on Feb 5, 2024

  1. Merge remote-tracking branch 'llvm/main' into onesided_lower_bound

    * llvm/main: (328 commits)
      [Flang][OpenMP] Attempt to make map-types-and-sizes.f90 test more agnostic to other architectures
      [Transforms] Add more cos combinations to SimplifyLibCalls and InstCombine (llvm#79699)
      [workflows] Close issues used for backports once the PR has been created (llvm#80394)
      [RISCV] Add support for RISC-V Pointer Masking (llvm#79929)
      [lldb] Cleanup regex in libcxx formatters (NFC) (llvm#80618)
      [lldb] Remove unused private TypeCategoryMap methods (NFC) (llvm#80602)
      [mlir][sparse] refine sparse assembler strategy (llvm#80521)
      [NFC] Fix typo (llvm#80703)
      Fix broken ARM processor features test (llvm#80717)
      [ValueTracking][NFC] Pass `SimplifyQuery` to `computeKnownFPClass` family (llvm#80657)
      [x86_64][windows][swift] do not use Swift async extended frame for wi… (llvm#80468)
      [X86] addConstantComments - add FP16 MOVSH asm comments support
      [X86] Regenerate some vector constant comments missed in recent patches to improve mask predicate handling in addConstantComments
      [clang][AMDGPU][CUDA] Handle __builtin_printf for device printf (llvm#68515)
      Add some clarification to email check message
      [GitHub][Workflows] Prevent multiple private email comments (temporarily) (llvm#80648)
      [workflows] Use /mnt as the build directory on Linux (llvm#80583)
      [Flang][OpenMP] Initial mapping of Fortran pointers and allocatables for target devices (llvm#71766)
      [AMDGPU] GlobalISel for f8 conversions (llvm#80503)
      [AMDGPU] Fixed byte_sel of v_cvt_f32_bf8/v_cvt_f32_fp8 (llvm#80502)
      ...
    ichaer committed Feb 5, 2024
    65bd9b7
  2. d0facc5

Commits on Feb 12, 2024

  1. a12aa37
  2. 69dba78
  3. fe1fe8c
  4. Allow for assertions in comparison count when in hardened mode for complexity validation.
    ichaer committed Feb 12, 2024
    bb2c758

Commits on Feb 13, 2024

  1. c6b895c
  2. 31321b9

Commits on Apr 23, 2024

  1. Merge remote-tracking branch 'llvm/main' into onesided_lower_bound

    * llvm/main: (7814 commits)
      [libc++] Add some private headers to libcxx.imp (llvm#89568)
      [Frontend][OpenMP] Add functions for checking construct type (llvm#87258)
      AtomicExpand: Emit or with constant on RHS
      [VPlan] Ignore incoming values with constant false mask. (llvm#89384)
      [Clang][Parser] Don't always destroy template annotations at the end of a declaration (llvm#89494)
      [X86] getTargetShuffleMask - update to take a SDValue instead of a SDNode. NFC.
      [DAGCombiner] Fix miscompile bug in combineShiftOfShiftedLogic (llvm#89616)
      [DAGCombiner] Pre-commit test case for miscompile bug in combineShiftOfShiftedLogic
      [Flang][OpenMP] Add restriction about subobjects to firstprivate and … (llvm#89608)
      [SelectionDAG] Mark frame index as "aliased" at argument copy elison (llvm#89712)
      Pre-commit reproducer for argument copy elison related bug
      [InstCombine] Fold fcmp into select (llvm#86482)
      Make default initialization explicit
      [clang-tidy] Avoid overflow when dumping unsigned integer values (llvm#85060)
      [ARM][AArch64] Split out processor and feature tablegen defs [NFC] (llvm#88282)
      [TableGen] Fix ReplaceRegAction RTTI Kind
      [VectorCombine] foldShuffleOfShuffles - add missing arguments to getShuffleCost calls.
      [VPlan] Skip extending ICmp results in trunateToMinimalBitwidth.
      [LLVM][CodeGen][SVE] rev(whilelo(a,b)) -> whilehi(b,a). (llvm#88294)
      RenameIndependentSubregs: Add missing sub-range for new IMPLICIT_DEFs (llvm#89050)
      ...
    ichaer committed Apr 23, 2024
    6c88549
  2. 3805e95
  3. Address feedback about qualifying abort(), added comment to clarify choice of not having a `default` case in `switch`.
    ichaer committed Apr 23, 2024
    090df86
  4. Address comment about broken comment for getVectorOfRandom(): move the function closer to its point of usage and document what `genCacheUnfriendlyData()` is trying to do in its own comment. `getVectorOfRandom()` has imho a good name which describes all it's meant to achieve, it's `genCacheUnfriendlyData()` that needs explaining.
    ichaer committed Apr 23, 2024
    cb92d3c
  5. Oops, forgot to format =/. Still working on the remaining feedback, but it would be good to be sure that we have a good baseline after this big merge from main.
    ichaer committed Apr 23, 2024
    f4a6f36
  6. Address comment about making the benchmark's struct MoveInto into a function -- make it a lambda, to avoid the explicit template parameter a freestanding function would require.
    ichaer committed Apr 23, 2024
    3f9cfec
  7. 1afb99d
  8. 613e64a
  9. Rename new sentinel-based _IterOps::advance() to `_IterOps::__advance_to` -- no reason IMO to have a second override if `__advance_to = ranges::advance` in c++20...
    ichaer committed Apr 23, 2024
    4588447
  10. Address feedback about using `iterator_traits<_Iter>::difference_type` instead of a templated `_Distance` in `_IterOps::__advance_to()`
    ichaer committed Apr 23, 2024
    2af9a6f
  11. 4f05ded
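
The sentinel-based `__advance_to` mentioned in commits 4588447 and 2af9a6f is compared there to `ranges::advance`. As a rough sketch of that idea, and assuming it behaves like `std::ranges::advance(it, n, bound)` for non-negative counts, a bounded advance moves the iterator at most `count` steps, stops at the sentinel, and returns how many steps it could not take. `advance_to` below is a hypothetical free function, not the `_IterOps` member from the PR.

```cpp
#include <cassert>
#include <iterator>

// Hypothetical stand-in for a bounded advance; forward direction only.
template <class Iter, class Sent>
typename std::iterator_traits<Iter>::difference_type
advance_to(Iter& it, typename std::iterator_traits<Iter>::difference_type count, Sent last) {
  assert(count >= 0);
  for (; count > 0 && it != last; --count)
    ++it;
  return count; // number of steps that could not be taken
}
```

A galloping search can subtract the returned remainder from the requested step to learn how far it actually moved, which is what it needs in order to bisect the last span without measuring the range separately.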

Commits on Apr 24, 2024

  1. 161d81c
  2. Address review comments about set_intersection.h: unnecessary namespace qualification, insufficient comments, and direct use of iterator traits.
    ichaer committed Apr 24, 2024
    3c9f800
  3. Address review comment about replacing struct __set_intersector with a function. I think I managed to preserve readability by keeping `__add_output_unless()` as a lambda.
    ichaer committed Apr 24, 2024
    4aa4a82

Commits on Apr 26, 2024

  1. Make __add_output_unless() a freestanding function, `__set_intersection_add_output_unless()`, because the lambda [tripped the "MacOS with C++03" test run](https://buildkite.com/llvm-project/libcxx-ci/builds/35055#018f173c-f155-4fbb-b6d7-a7aba01cec9e):
    
    ```
    ```
    ichaer committed Apr 26, 2024
    8307b2d

Commits on Apr 27, 2024

  1. Address comment about using std::forward<_Compare>() for consistency in `__set_intersection()` base overload.
    ichaer committed Apr 27, 2024
    be6c5c8
  2. 62a6010
  3. e2af5cc

Commits on Apr 28, 2024

  1. 89201ea

Commits on Apr 29, 2024

  1. Remove unnecessary PauseTiming()/ResumeTiming() in the benchmark data generation stage, time won't be measured before we go into the benchmark::State loops.
    ichaer committed Apr 29, 2024
    5f6e7fe

Commits on May 24, 2024

  1. Apply suggestions from code review

    @ldionne's latest inline review suggestions.
    
    Co-authored-by: Louis Dionne
    ichaer and ldionne authored May 24, 2024
    109e5a4
  2. cc95b51
  3. Move new _IterOps<_ClassicAlgPolicy>::__advance_to() overloads next to its pre-existing sibling and remove leftover comments.
    ichaer committed May 24, 2024
    91e4e51

Commits on May 27, 2024

  1. c977bb7
  2. b4fad5b
  3. change complexity test interface, 's/testSetIntersectionAndReturnOpCounts/counted_set_intersection/', and split complexity tests into `set_intersection_complexity.pass.cpp`.
    ichaer committed May 27, 2024
    87f12c2
  4. Move last complexity test to the new file, and add a matching one for `std::set_intersection()`
    ichaer committed May 27, 2024
    505c004
  5. Add yet another complexity test, this one validating the standard guarantees for a single match over an input of 1 to 20 elements.
    ichaer committed May 27, 2024
    95b118a

Commits on May 28, 2024

  1. Take into account additional comparison performed in assertion in hardening mode inside `testComplexityBasic()` as well.
    ichaer committed May 28, 2024
    b1bfa0f
  2. Add release note. It reads a bit awkward, maybe I'll come up with something better after some more thinking...
    ichaer committed May 28, 2024
    f501bdc

Commits on Jul 16, 2024

  1. * s/__(prev_)?advanced/__$1_may_be_equal/g

    * s/__set_intersection_add_output_unless/__set_intersection_add_output_if_equal/
    * Add comments to explain the logic for equality comparison
    ichaer committed Jul 16, 2024
    c5df570
  2. Oops

    ichaer committed Jul 16, 2024
    6189e95

Commits on Jul 17, 2024

  1. Merge remote-tracking branch 'llvm/main' into onesided_lower_bound

    * llvm/main: (8718 commits)
      [LLVM] Add `llvm.experimental.vector.compress` intrinsic (llvm#92289)
      [Clang] [C23] Implement N2653: u8 strings are char8_t[] (llvm#97208)
      [SLP][REVEC] Make SLP support revectorization (-slp-revec) and add simple test. (llvm#98269)
      [Flang] Exclude the reference to TIME_UTC for AIX. (llvm#99069)
      [clang][Sema] Improve `Sema::CheckCXXDefaultArguments` (llvm#97338)
      adjust the Xtensa backend after change f270a4d
      [lldb][Bazel]: Second attempt to adapt for a751f65
      Revert "[lldb][Bazel]: Adapt BUILD.bazel file for a751f65"
      [lldb][Bazel]: Adapt BUILD.bazel file for a751f65
      [emacs] Fix autoloading for llvm-mir-mode (llvm#98984)
      [SLP]Improve minbitwidth analysis for trun'ed gather nodes.
      [Clang][Concepts] Avoid substituting into constraints for invalid TemplateDecls (llvm#75697)
      [LV][NFC]Introduce isScalableVectorizationAllowed() to refactor getMaxLegalScalableVF().
      Revert "[AArch64] Remove superfluous sxtw in peephole opt (llvm#96293)"
      [AArch64][GISel] Add test cases for folding shifts into load/store addressing modes (NFC)
      [LV] Process dead interleave pointer ops in reverse order.
      [AMDGPU] Remove SIWholeQuadMode pass early exit (llvm#98450)
      [InstrRef][NFC] Avoid un-necessary DenseMap queries (llvm#99048)
      [gn build] Port e94e72a
      [LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2) (llvm#95819)
      ...
    ichaer committed Jul 17, 2024
    6eacf2f