Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADT] Make set_subtract more efficient when subtrahend is larger (NFC) #99401

Merged
merged 1 commit into from
Jul 17, 2024

Conversation

kazutakahirata
Copy link
Contributor

This patch is based on:

commit fffe272
Author: Teresa Johnson [email protected]
Date: Wed Jul 17 13:53:10 2024 -0700

This iteration comes with a couple of improvements:

  • We now accommodate S2Ty being SmallPtrSet, which has remove_if(pred)
    but not erase(iterator). (Lack of this code path broke the mlir
    build.)

  • The code path for erase(iterator) now pre-increments the iterator to
    avoid problems with iterator invalidation.

This patch is based on:

  commit fffe272
  Author: Teresa Johnson <[email protected]>
  Date:   Wed Jul 17 13:53:10 2024 -0700

This iteration comes with a couple of improvements:

- We now accommodate S2Ty being SmallPtrSet, which has remove_if(pred)
  but not erase(iterator).  (Lack of this code path broke the mlir
  build.)

- The code path for erase(iterator) now pre-increments the iterator to
  avoid problems with iterator invalidation.
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 17, 2024

@llvm/pr-subscribers-llvm-adt

Author: Kazu Hirata (kazutakahirata)

Changes

This patch is based on:

commit fffe272
Author: Teresa Johnson <[email protected]>
Date: Wed Jul 17 13:53:10 2024 -0700

This iteration comes with a couple of improvements:

  • We now accommodate S2Ty being SmallPtrSet, which has remove_if(pred)
    but not erase(iterator). (Lack of this code path broke the mlir
    build.)

  • The code path for erase(iterator) now pre-increments the iterator to
    avoid problems with iterator invalidation.


Full diff: https://github.com/llvm/llvm-project/pull/99401.diff

1 Files Affected:

  • (modified) llvm/include/llvm/ADT/SetOperations.h (+37)
diff --git a/llvm/include/llvm/ADT/SetOperations.h b/llvm/include/llvm/ADT/SetOperations.h
index 1a911b239f4c6..2b1a103565f7d 100644
--- a/llvm/include/llvm/ADT/SetOperations.h
+++ b/llvm/include/llvm/ADT/SetOperations.h
@@ -27,6 +27,15 @@ using check_has_member_remove_if_t =
 template <typename Set, typename Fn>
 static constexpr bool HasMemberRemoveIf =
     is_detected<check_has_member_remove_if_t, Set, Fn>::value;
+
+template <typename Set>
+using check_has_member_erase_iter_t =
+    decltype(std::declval<Set>().erase(std::declval<Set>().begin()));
+
+template <typename Set>
+static constexpr bool HasMemberEraseIter =
+    is_detected<check_has_member_erase_iter_t, Set>::value;
+
 } // namespace detail
 
 /// set_union(A, B) - Compute A := A u B, return whether A changed.
@@ -94,7 +103,35 @@ S1Ty set_difference(const S1Ty &S1, const S2Ty &S2) {
 
 /// set_subtract(A, B) - Compute A := A - B
 ///
+/// Selects the set to iterate based on the relative sizes of A and B for better
+/// efficiency.
+///
 template <class S1Ty, class S2Ty> void set_subtract(S1Ty &S1, const S2Ty &S2) {
+  // If S1 is smaller than S2, iterate on S1 provided that S2 supports efficient
+  // lookups via contains().  Note that a couple callers pass a vector for S2,
+  // which doesn't support contains(), and wouldn't be efficient if it did.
+  using ElemTy = decltype(*S1.begin());
+  if constexpr (detail::HasMemberContains<S2Ty, ElemTy>) {
+    auto Pred = [&S2](const auto &E) { return S2.contains(E); };
+    if constexpr (detail::HasMemberRemoveIf<S1Ty, decltype(Pred)>) {
+      if (S1.size() < S2.size()) {
+        S1.remove_if(Pred);
+        return;
+      }
+    } else if constexpr (detail::HasMemberEraseIter<S1Ty>) {
+      if (S1.size() < S2.size()) {
+        typename S1Ty::iterator Next;
+        for (typename S1Ty::iterator SI = S1.begin(), SE = S1.end(); SI != SE;
+             SI = Next) {
+          Next = std::next(SI);
+          if (S2.contains(*SI))
+            S1.erase(SI);
+        }
+        return;
+      }
+    }
+  }
+
   for (typename S2Ty::const_iterator SI = S2.begin(), SE = S2.end(); SI != SE;
        ++SI)
     S1.erase(*SI);

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

@kazutakahirata kazutakahirata merged commit d772cdd into llvm:main Jul 17, 2024
6 of 8 checks passed
@kazutakahirata kazutakahirata deleted the cleanup_set_subtract branch July 17, 2024 23:44
@nikic
Copy link
Contributor

nikic commented Jul 18, 2024

Can you please also add unit tests for this? One where LHS is SmallPtrSet and one where RHS is SmallVector.

Harini0924 pushed a commit to Harini0924/llvm-project that referenced this pull request Jul 22, 2024
llvm#99401)

This patch is based on:

  commit fffe272
  Author: Teresa Johnson <[email protected]>
  Date:   Wed Jul 17 13:53:10 2024 -0700

This iteration comes with a couple of improvements:

- We now accommodate S2Ty being SmallPtrSet, which has remove_if(pred)
  but not erase(iterator).  (Lack of this code path broke the mlir
  build.)

- The code path for erase(iterator) now pre-increments the iterator to
  avoid problems with iterator invalidation.
sgundapa pushed a commit to sgundapa/upstream_effort that referenced this pull request Jul 23, 2024
llvm#99401)

This patch is based on:

  commit fffe272
  Author: Teresa Johnson <[email protected]>
  Date:   Wed Jul 17 13:53:10 2024 -0700

This iteration comes with a couple of improvements:

- We now accommodate S2Ty being SmallPtrSet, which has remove_if(pred)
  but not erase(iterator).  (Lack of this code path broke the mlir
  build.)

- The code path for erase(iterator) now pre-increments the iterator to
  avoid problems with iterator invalidation.
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
#99401)

Summary:
This patch is based on:

  commit fffe272
  Author: Teresa Johnson <[email protected]>
  Date:   Wed Jul 17 13:53:10 2024 -0700

This iteration comes with a couple of improvements:

- We now accommodate S2Ty being SmallPtrSet, which has remove_if(pred)
  but not erase(iterator).  (Lack of this code path broke the mlir
  build.)

- The code path for erase(iterator) now pre-increments the iterator to
  avoid problems with iterator invalidation.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251011
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants