Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SelectOpt] Add handling for not conditions. #92517

Merged
merged 3 commits into from
May 22, 2024

Conversation

davemgreen
Copy link
Collaborator

This patch attempts to help the SelectOpt pass detect select groups made up of conditions and not(conditions). Usually these are canonicalized in instcombine to remove the not and invert the true/false values, but this will not happen for Loginal operations, which can be beneficial to convert if they are part of a larger select group. The handling for not's are mostly handled in the SelectLike, which can be marked as Inverted in order to reverse the TrueValue and FalseValue.

This helps fix a regression in fortran minloc constructs, after #84628 helped simplify a loop with branches into a loop with selects.

@llvmbot
Copy link
Collaborator

llvmbot commented May 17, 2024

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

This patch attempts to help the SelectOpt pass detect select groups made up of conditions and not(conditions). Usually these are canonicalized in instcombine to remove the not and invert the true/false values, but this will not happen for Loginal operations, which can be beneficial to convert if they are part of a larger select group. The handling for not's are mostly handled in the SelectLike, which can be marked as Inverted in order to reverse the TrueValue and FalseValue.

This helps fix a regression in fortran minloc constructs, after #84628 helped simplify a loop with branches into a loop with selects.


Patch is 42.36 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92517.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectOptimize.cpp (+59-16)
  • (modified) llvm/test/CodeGen/AArch64/selectopt-not.ll (+220-106)
diff --git a/llvm/lib/CodeGen/SelectOptimize.cpp b/llvm/lib/CodeGen/SelectOptimize.cpp
index 2e03ae6aec94d..30befb1a537f3 100644
--- a/llvm/lib/CodeGen/SelectOptimize.cpp
+++ b/llvm/lib/CodeGen/SelectOptimize.cpp
@@ -130,7 +130,11 @@ class SelectOptimizeImpl {
   class SelectLike {
     SelectLike(Instruction *I) : I(I) {}
 
+    /// The select (/or) instruction.
     Instruction *I;
+    /// Whether this select is inverted, "not(cond), FalseVal, TrueVal", as
+    /// opposed to the original condition.
+    bool Inverted = false;
 
   public:
     /// Match a select or select-like instruction, returning a SelectLike.
@@ -153,14 +157,21 @@ class SelectOptimizeImpl {
     bool isValid() { return I; }
     operator bool() { return isValid(); }
 
+    /// Invert the select by inverting the condition and switching the operands.
+    void setInverted() {
+      assert(!Inverted && "Trying to invert and inverted SelectLike");
+      assert(isa<Instruction>(getCondition()) &&
+             cast<Instruction>(getCondition())->getOpcode() == Instruction::Xor);
+      Inverted = true;
+    }
+    bool isInverted() const { return Inverted; }
+
     Instruction *getI() { return I; }
     const Instruction *getI() const { return I; }
 
     Type *getType() const { return I->getType(); }
 
-    /// Return the condition for the SelectLike instruction. For example the
-    /// condition of a select or c in `or(zext(c), x)`
-    Value *getCondition() const {
+    Value *getNonInvertedCondition() const {
       if (auto *Sel = dyn_cast<SelectInst>(I))
         return Sel->getCondition();
       // Or(zext) case
@@ -177,11 +188,22 @@ class SelectOptimizeImpl {
       llvm_unreachable("Unhandled case in getCondition");
     }
 
+    /// Return the condition for the SelectLike instruction. For example the
+    /// condition of a select or c in `or(zext(c), x)`
+    Value *getCondition() const {
+      Value *CC = getNonInvertedCondition();
+      if (Inverted)
+        return cast<Instruction>(CC)->getOperand(0);
+      return CC;
+    }
+
     /// Return the true value for the SelectLike instruction. Note this may not
     /// exist for all SelectLike instructions. For example, for `or(zext(c), x)`
     /// the true value would be `or(x,1)`. As this value does not exist, nullptr
     /// is returned.
-    Value *getTrueValue() const {
+    Value *getTrueValue(bool HonorInverts = true) const {
+      if (Inverted && HonorInverts)
+        return getFalseValue(false);
       if (auto *Sel = dyn_cast<SelectInst>(I))
         return Sel->getTrueValue();
       // Or(zext) case - The true value is Or(X), so return nullptr as the value
@@ -195,7 +217,9 @@ class SelectOptimizeImpl {
     /// Return the false value for the SelectLike instruction. For example the
     /// getFalseValue of a select or `x` in `or(zext(c), x)` (which is
     /// `select(c, x|1, x)`)
-    Value *getFalseValue() const {
+    Value *getFalseValue(bool HonorInverts = true) const {
+      if (Inverted && HonorInverts)
+        return getTrueValue(false);
       if (auto *Sel = dyn_cast<SelectInst>(I))
         return Sel->getFalseValue();
       // Or(zext) case - return the operand which is not the zext.
@@ -216,8 +240,8 @@ class SelectOptimizeImpl {
     /// InstCostMap. This may need to be generated for select-like instructions.
     Scaled64 getTrueOpCost(DenseMap<const Instruction *, CostInfo> &InstCostMap,
                            const TargetTransformInfo *TTI) {
-      if (auto *Sel = dyn_cast<SelectInst>(I))
-        if (auto *I = dyn_cast<Instruction>(Sel->getTrueValue()))
+      if (isa<SelectInst>(I))
+        if (auto *I = dyn_cast<Instruction>(getTrueValue()))
           return InstCostMap.contains(I) ? InstCostMap[I].NonPredCost
                                          : Scaled64::getZero();
 
@@ -242,8 +266,8 @@ class SelectOptimizeImpl {
     Scaled64
     getFalseOpCost(DenseMap<const Instruction *, CostInfo> &InstCostMap,
                    const TargetTransformInfo *TTI) {
-      if (auto *Sel = dyn_cast<SelectInst>(I))
-        if (auto *I = dyn_cast<Instruction>(Sel->getFalseValue()))
+      if (isa<SelectInst>(I))
+        if (auto *I = dyn_cast<Instruction>(getFalseValue()))
           return InstCostMap.contains(I) ? InstCostMap[I].NonPredCost
                                          : Scaled64::getZero();
 
@@ -510,9 +534,10 @@ getTrueOrFalseValue(SelectOptimizeImpl::SelectLike SI, bool isTrue,
   for (SelectInst *DefSI = dyn_cast<SelectInst>(SI.getI());
        DefSI != nullptr && Selects.count(DefSI);
        DefSI = dyn_cast<SelectInst>(V)) {
-    assert(DefSI->getCondition() == SI.getCondition() &&
-           "The condition of DefSI does not match with SI");
-    V = (isTrue ? DefSI->getTrueValue() : DefSI->getFalseValue());
+    if (DefSI->getCondition() == SI.getCondition())
+      V = (isTrue ? DefSI->getTrueValue() : DefSI->getFalseValue());
+    else // Handle inverted SI
+      V = (!isTrue ? DefSI->getTrueValue() : DefSI->getFalseValue());
   }
 
   if (isa<BinaryOperator>(SI.getI())) {
@@ -634,16 +659,17 @@ void SelectOptimizeImpl::convertProfitableSIGroups(SelectGroups &ProfSIGroups) {
 
     // Move any debug/pseudo instructions that were in-between the select
     // group to the newly-created end block.
-    SmallVector<Instruction *, 2> DebugPseudoINS;
+    SmallVector<Instruction *, 2> SinkInstrs;
     auto DIt = SI.getI()->getIterator();
     while (&*DIt != LastSI.getI()) {
       if (DIt->isDebugOrPseudoInst())
-        DebugPseudoINS.push_back(&*DIt);
+        SinkInstrs.push_back(&*DIt);
+      if (match(&*DIt, m_Not(m_Specific(SI.getCondition()))))
+        SinkInstrs.push_back(&*DIt);
       DIt++;
     }
-    for (auto *DI : DebugPseudoINS) {
+    for (auto *DI : SinkInstrs)
       DI->moveBeforePreserving(&*EndBlock->getFirstInsertionPt());
-    }
 
     // Duplicate implementation for DbgRecords, the non-instruction debug-info
     // format. Helper lambda for moving DbgRecords to the end block.
@@ -765,6 +791,13 @@ void SelectOptimizeImpl::collectSelectGroups(BasicBlock &BB,
           ++BBIt;
           continue;
         }
+
+        // Skip not(select, if the not is part of the same select group
+        if (match(NI, m_Not(m_Specific(SI.getCondition())))) {
+          ++BBIt;
+          continue;
+        }
+
         // We only allow selects in the same group, not other select-like
         // instructions.
         if (!isa<SelectInst>(NI))
@@ -773,6 +806,10 @@ void SelectOptimizeImpl::collectSelectGroups(BasicBlock &BB,
         SelectLike NSI = SelectLike::match(NI);
         if (NSI && SI.getCondition() == NSI.getCondition()) {
           SIGroup.push_back(NSI);
+        } else if (NSI && match(NSI.getCondition(),
+                                m_Not(m_Specific(SI.getCondition())))) {
+          NSI.setInverted();
+          SIGroup.push_back(NSI);
         } else
           break;
         ++BBIt;
@@ -783,6 +820,12 @@ void SelectOptimizeImpl::collectSelectGroups(BasicBlock &BB,
       if (!isSelectKindSupported(SI))
         continue;
 
+      LLVM_DEBUG({
+        dbgs() << "New Select group with\n";
+        for (auto SI : SIGroup)
+          dbgs() << "  " << *SI.getI() << "\n";
+      });
+
       SIGroups.push_back(SIGroup);
     }
   }
diff --git a/llvm/test/CodeGen/AArch64/selectopt-not.ll b/llvm/test/CodeGen/AArch64/selectopt-not.ll
index 7a949d11c80d0..a7939d651a2c6 100644
--- a/llvm/test/CodeGen/AArch64/selectopt-not.ll
+++ b/llvm/test/CodeGen/AArch64/selectopt-not.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -select-optimize -mtriple=aarch64-linux-gnu -mcpu=neoverse-v2 -S < %s | FileCheck %s
+; RUN: opt -select-optimize -mtriple=aarch64-linux-gnu -mcpu=neoverse-v2 -S < %s | FileCheck %s --check-prefixes=CHECK,CHECK-STANDARD
+; RUN: opt -select-optimize -mtriple=aarch64-linux-gnu -mcpu=neoverse-v2 -S -disable-loop-level-heuristics < %s | FileCheck %s --check-prefixes=CHECK,CHECK-FORCED
 
 target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
 target triple = "aarch64"
@@ -29,10 +30,10 @@ define i32 @minloc1(ptr nocapture readonly %0, ptr nocapture readonly %1, ptr no
 ; CHECK-NEXT:    [[TMP21:%.*]] = sub i64 0, [[TMP7]]
 ; CHECK-NEXT:    br label [[DOTPREHEADER35:%.*]]
 ; CHECK:       .preheader35:
-; CHECK-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[TMP23:%.*]] = phi i64 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[IV_N:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[DOT045:%.*]] = phi i1 [ false, [[DOTPREHEADER35_LR_PH]] ], [ [[DOT2:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[DOTLCSSA364144:%.*]] = phi i32 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP29:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[SELECT_END:%.*]] ]
+; CHECK-NEXT:    [[TMP23:%.*]] = phi i64 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[IV_N:%.*]], [[SELECT_END]] ]
+; CHECK-NEXT:    [[DOT045:%.*]] = phi i1 [ false, [[DOTPREHEADER35_LR_PH]] ], [ [[DOT2:%.*]], [[SELECT_END]] ]
+; CHECK-NEXT:    [[DOTLCSSA364144:%.*]] = phi i32 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP29:%.*]], [[SELECT_END]] ]
 ; CHECK-NEXT:    [[TMP24:%.*]] = mul nsw i64 [[TMP23]], [[TMP11]]
 ; CHECK-NEXT:    [[TMP25:%.*]] = getelementptr i8, ptr [[TMP19]], i64 [[TMP24]]
 ; CHECK-NEXT:    [[TMP26:%.*]] = load i32, ptr [[TMP25]], align 4
@@ -40,15 +41,20 @@ define i32 @minloc1(ptr nocapture readonly %0, ptr nocapture readonly %1, ptr no
 ; CHECK-NEXT:    [[TMP28:%.*]] = icmp sge i32 [[TMP26]], [[TMP22]]
 ; CHECK-NEXT:    [[DOTNOT33:%.*]] = and i1 [[DOT045]], [[TMP28]]
 ; CHECK-NEXT:    [[OR_COND:%.*]] = select i1 [[TMP27]], i1 true, i1 [[DOTNOT33]]
-; CHECK-NEXT:    [[TMP29]] = select i1 [[OR_COND]], i32 [[DOTLCSSA364144]], i32 1
+; CHECK-NEXT:    [[OR_COND_FROZEN:%.*]] = freeze i1 [[OR_COND]]
+; CHECK-NEXT:    br i1 [[OR_COND_FROZEN]], label [[SELECT_END]], label [[SELECT_FALSE:%.*]]
+; CHECK:       select.false:
+; CHECK-NEXT:    br label [[SELECT_END]]
+; CHECK:       select.end:
+; CHECK-NEXT:    [[TMP29]] = phi i32 [ [[DOTLCSSA364144]], [[DOTPREHEADER35]] ], [ 1, [[SELECT_FALSE]] ]
+; CHECK-NEXT:    [[DOT2]] = phi i1 [ [[DOT045]], [[DOTPREHEADER35]] ], [ true, [[SELECT_FALSE]] ]
+; CHECK-NEXT:    [[TMP30]] = phi i32 [ [[TMP22]], [[DOTPREHEADER35]] ], [ [[TMP20]], [[SELECT_FALSE]] ]
 ; CHECK-NEXT:    [[NOT_OR_COND:%.*]] = xor i1 [[OR_COND]], true
-; CHECK-NEXT:    [[DOT2]] = select i1 [[NOT_OR_COND]], i1 true, i1 [[DOT045]]
-; CHECK-NEXT:    [[TMP30]] = select i1 [[OR_COND]], i32 [[TMP22]], i32 [[TMP20]]
 ; CHECK-NEXT:    [[IV_N]] = add nuw nsw i64 [[TMP23]], 1
 ; CHECK-NEXT:    [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_N]], [[TMP9]]
 ; CHECK-NEXT:    br i1 [[EXITCOND_NOT]], label [[DOTPREHEADER]], label [[DOTPREHEADER35]]
 ; CHECK:       .preheader:
-; CHECK-NEXT:    [[DOTLCSSA3641_LCSSA:%.*]] = phi i32 [ 0, [[TMP3:%.*]] ], [ [[TMP29]], [[DOTPREHEADER35]] ]
+; CHECK-NEXT:    [[DOTLCSSA3641_LCSSA:%.*]] = phi i32 [ 0, [[TMP3:%.*]] ], [ [[TMP29]], [[SELECT_END]] ]
 ; CHECK-NEXT:    ret i32 [[DOTLCSSA3641_LCSSA]]
 ;
   %4 = getelementptr i8, ptr %0, i64 40
@@ -101,53 +107,106 @@ define i32 @minloc1(ptr nocapture readonly %0, ptr nocapture readonly %1, ptr no
 }
 
 define i32 @minloc1_otherunusednot(ptr nocapture readonly %0, ptr nocapture readonly %1, ptr nocapture readonly %2) {
-; CHECK-LABEL: @minloc1_otherunusednot(
-; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr i8, ptr [[TMP0:%.*]], i64 40
-; CHECK-NEXT:    [[TMP5:%.*]] = load i64, ptr [[TMP4]], align 8
-; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr [[TMP0]], i64 64
-; CHECK-NEXT:    [[TMP7:%.*]] = load i64, ptr [[TMP6]], align 8
-; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr i8, ptr [[TMP0]], i64 80
-; CHECK-NEXT:    [[TMP9:%.*]] = load i64, ptr [[TMP8]], align 8
-; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr [[TMP0]], i64 88
-; CHECK-NEXT:    [[TMP11:%.*]] = load i64, ptr [[TMP10]], align 8
-; CHECK-NEXT:    [[TMP12:%.*]] = load ptr, ptr [[TMP0]], align 8
-; CHECK-NEXT:    [[TMP13:%.*]] = load i32, ptr [[TMP1:%.*]], align 4
-; CHECK-NEXT:    [[TMP14:%.*]] = sext i32 [[TMP13]] to i64
-; CHECK-NEXT:    [[TMP15:%.*]] = add nsw i64 [[TMP14]], -1
-; CHECK-NEXT:    [[TMP16:%.*]] = mul i64 [[TMP15]], [[TMP5]]
-; CHECK-NEXT:    [[TMP17:%.*]] = getelementptr i8, ptr [[TMP12]], i64 [[TMP16]]
-; CHECK-NEXT:    [[TMP18:%.*]] = shl i64 [[TMP7]], 3
-; CHECK-NEXT:    [[TMP19:%.*]] = getelementptr i8, ptr [[TMP17]], i64 [[TMP18]]
-; CHECK-NEXT:    [[TMP20:%.*]] = load i32, ptr [[TMP2:%.*]], align 4
-; CHECK-NEXT:    [[DOTNOT:%.*]] = icmp slt i64 [[TMP9]], 1
-; CHECK-NEXT:    br i1 [[DOTNOT]], label [[DOTPREHEADER:%.*]], label [[DOTPREHEADER35_LR_PH:%.*]]
-; CHECK:       .preheader35.lr.ph:
-; CHECK-NEXT:    [[TMP21:%.*]] = sub i64 0, [[TMP7]]
-; CHECK-NEXT:    br label [[DOTPREHEADER35:%.*]]
-; CHECK:       .preheader35:
-; CHECK-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[TMP23:%.*]] = phi i64 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[IV_N:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[DOT045:%.*]] = phi i1 [ false, [[DOTPREHEADER35_LR_PH]] ], [ [[DOT2:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[DOTLCSSA364144:%.*]] = phi i32 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP29:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[TMP24:%.*]] = mul nsw i64 [[TMP23]], [[TMP11]]
-; CHECK-NEXT:    [[TMP25:%.*]] = getelementptr i8, ptr [[TMP19]], i64 [[TMP24]]
-; CHECK-NEXT:    [[TMP26:%.*]] = load i32, ptr [[TMP25]], align 4
-; CHECK-NEXT:    [[TMP27:%.*]] = icmp ne i32 [[TMP26]], [[TMP20]]
-; CHECK-NEXT:    [[TMP28:%.*]] = icmp sge i32 [[TMP26]], [[TMP22]]
-; CHECK-NEXT:    [[DOTNOT33:%.*]] = and i1 [[DOT045]], [[TMP28]]
-; CHECK-NEXT:    [[OR_COND:%.*]] = select i1 [[TMP27]], i1 true, i1 [[DOTNOT33]]
-; CHECK-NEXT:    [[TMP29]] = select i1 [[OR_COND]], i32 [[DOTLCSSA364144]], i32 1
-; CHECK-NEXT:    [[DOT2]] = select i1 [[OR_COND]], i1 [[DOT045]], i1 true
-; CHECK-NEXT:    [[NOT_OR_COND:%.*]] = xor i1 [[OR_COND]], true
-; CHECK-NEXT:    [[TMP30]] = select i1 [[OR_COND]], i32 [[TMP22]], i32 [[TMP20]]
-; CHECK-NEXT:    [[IV_N]] = add nuw nsw i64 [[TMP23]], 1
-; CHECK-NEXT:    [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_N]], [[TMP9]]
-; CHECK-NEXT:    br i1 [[EXITCOND_NOT]], label [[DOTPREHEADER]], label [[DOTPREHEADER35]]
-; CHECK:       .preheader:
-; CHECK-NEXT:    [[DOTLCSSA3641_LCSSA:%.*]] = phi i32 [ 0, [[TMP3:%.*]] ], [ [[TMP29]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[P:%.*]] = phi i1 [ false, [[TMP3]] ], [ [[NOT_OR_COND]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[Q:%.*]] = select i1 [[P]], i32 [[DOTLCSSA3641_LCSSA]], i32 1
-; CHECK-NEXT:    ret i32 [[Q]]
+; CHECK-STANDARD-LABEL: @minloc1_otherunusednot(
+; CHECK-STANDARD-NEXT:    [[TMP4:%.*]] = getelementptr i8, ptr [[TMP0:%.*]], i64 40
+; CHECK-STANDARD-NEXT:    [[TMP5:%.*]] = load i64, ptr [[TMP4]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr [[TMP0]], i64 64
+; CHECK-STANDARD-NEXT:    [[TMP7:%.*]] = load i64, ptr [[TMP6]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP8:%.*]] = getelementptr i8, ptr [[TMP0]], i64 80
+; CHECK-STANDARD-NEXT:    [[TMP9:%.*]] = load i64, ptr [[TMP8]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr [[TMP0]], i64 88
+; CHECK-STANDARD-NEXT:    [[TMP11:%.*]] = load i64, ptr [[TMP10]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP12:%.*]] = load ptr, ptr [[TMP0]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP13:%.*]] = load i32, ptr [[TMP1:%.*]], align 4
+; CHECK-STANDARD-NEXT:    [[TMP14:%.*]] = sext i32 [[TMP13]] to i64
+; CHECK-STANDARD-NEXT:    [[TMP15:%.*]] = add nsw i64 [[TMP14]], -1
+; CHECK-STANDARD-NEXT:    [[TMP16:%.*]] = mul i64 [[TMP15]], [[TMP5]]
+; CHECK-STANDARD-NEXT:    [[TMP17:%.*]] = getelementptr i8, ptr [[TMP12]], i64 [[TMP16]]
+; CHECK-STANDARD-NEXT:    [[TMP18:%.*]] = shl i64 [[TMP7]], 3
+; CHECK-STANDARD-NEXT:    [[TMP19:%.*]] = getelementptr i8, ptr [[TMP17]], i64 [[TMP18]]
+; CHECK-STANDARD-NEXT:    [[TMP20:%.*]] = load i32, ptr [[TMP2:%.*]], align 4
+; CHECK-STANDARD-NEXT:    [[DOTNOT:%.*]] = icmp slt i64 [[TMP9]], 1
+; CHECK-STANDARD-NEXT:    br i1 [[DOTNOT]], label [[DOTPREHEADER:%.*]], label [[DOTPREHEADER35_LR_PH:%.*]]
+; CHECK-STANDARD:       .preheader35.lr.ph:
+; CHECK-STANDARD-NEXT:    [[TMP21:%.*]] = sub i64 0, [[TMP7]]
+; CHECK-STANDARD-NEXT:    br label [[DOTPREHEADER35:%.*]]
+; CHECK-STANDARD:       .preheader35:
+; CHECK-STANDARD-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[TMP23:%.*]] = phi i64 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[IV_N:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[DOT045:%.*]] = phi i1 [ false, [[DOTPREHEADER35_LR_PH]] ], [ [[DOT2:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[DOTLCSSA364144:%.*]] = phi i32 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP29:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[TMP24:%.*]] = mul nsw i64 [[TMP23]], [[TMP11]]
+; CHECK-STANDARD-NEXT:    [[TMP25:%.*]] = getelementptr i8, ptr [[TMP19]], i64 [[TMP24]]
+; CHECK-STANDARD-NEXT:    [[TMP26:%.*]] = load i32, ptr [[TMP25]], align 4
+; CHECK-STANDARD-NEXT:    [[TMP27:%.*]] = icmp ne i32 [[TMP26]], [[TMP20]]
+; CHECK-STANDARD-NEXT:    [[TMP28:%.*]] = icmp sge i32 [[TMP26]], [[TMP22]]
+; CHECK-STANDARD-NEXT:    [[DOTNOT33:%.*]] = and i1 [[DOT045]], [[TMP28]]
+; CHECK-STANDARD-NEXT:    [[OR_COND:%.*]] = select i1 [[TMP27]], i1 true, i1 [[DOTNOT33]]
+; CHECK-STANDARD-NEXT:    [[TMP29]] = select i1 [[OR_COND]], i32 [[DOTLCSSA364144]], i32 1
+; CHECK-STANDARD-NEXT:    [[DOT2]] = select i1 [[OR_COND]], i1 [[DOT045]], i1 true
+; CHECK-STANDARD-NEXT:    [[NOT_OR_COND:%.*]] = xor i1 [[OR_COND]], true
+; CHECK-STANDARD-NEXT:    [[TMP30]] = select i1 [[OR_COND]], i32 [[TMP22]], i32 [[TMP20]]
+; CHECK-STANDARD-NEXT:    [[IV_N]] = add nuw nsw i64 [[TMP23]], 1
+; CHECK-STANDARD-NEXT:    [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_N]], [[TMP9]]
+; CHECK-STANDARD-NEXT:    br i1 [[EXITCOND_NOT]], label [[DOTPREHEADER]], label [[DOTPREHEADER35]]
+; CHECK-STANDARD:       .preheader:
+; CHECK-STANDARD-NEXT:    [[DOTLCSSA3641_LCSSA:%.*]] = phi i32 [ 0, [[TMP3:%.*]] ], [ [[TMP29]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[P:%.*]] = phi i1 [ false, [[TMP3]] ], [ [[NOT_OR_COND]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[Q:%.*]] = select i1 [[P]], i32 [[DOTLCSSA3641_LCSSA]], i32 1
+; CHECK-STANDARD-NEXT:    ret i32 [[Q]]
+;
+; CHECK-FORCED-LABEL: @minloc1_otherunusednot(
+; CHECK-FORCED-NEXT:    [[TMP4:%.*]] = getelementptr i8, ptr [[TMP0:%.*]], i64 40
+; CHECK-FORCED-NEXT:    [[TMP5:%.*]] = load i64, ptr [[TMP4]], align 8
+; CHECK-FORCED-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr [[TMP0]], i64 64
+; CHECK-FORCED-NEXT:    [[TMP7:%.*]] = load i64, ptr [[TMP6]], align 8
+; CHECK-FORCED-NEXT:    [[TMP8:%.*]] = getelementptr i8, ptr [[TMP0]], i64 80
+; CHECK-FORCED-NEXT:    [[TMP9:%.*]] = load i64, ptr [[TMP8]], align 8
+; CHECK-FORCED-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr [[TMP0]], i64 88
+; CHECK-FORCED-NEXT:    [[TMP11:%.*]] = load i64, ptr [[TMP10]], align 8
+; CHECK-FORCED-NEXT:    [[TMP12:%.*]] = load ptr, ptr [[TMP0]], align 8
+; CHECK-FORCED-NEXT:    [[TMP13:%.*]] = load i32, ptr [[TMP1:%.*]], align 4
+; CHECK-FORCED-NEXT:    [[TMP14:%.*]] = sext i32 [[TMP13]] to i64
+; CHECK-FORCED-NEXT:    [[TMP15:%.*]] = add nsw i64 [[TMP14]], -1
+; CHECK-FORCED-NEXT:    [[TMP16:%.*]] = mul i64 [[TMP15]], [[TMP5]]
+; CHECK-FORCED-NEXT:    [[TMP17:%.*]] = getelementptr i8, ptr [[TMP12]], i64 [[TMP16]]
+; CHECK-FORCED-NEXT:    [[TMP18:%.*]] = shl i64 [[TMP7]], 3
+; CHECK-FORCED-NEXT:    [[TMP19:%.*]] = getelementptr i8, ptr [[TMP17]], i64 [[TMP18]]
+; CHECK-FORCED-NEXT:    [[TMP20:%.*]] = load i32, ptr [[TMP2:%.*]], align 4
+; CHECK-FORCED-NEXT:    [[DOTNOT:%.*]] = icmp slt i64 [[TMP9]], 1
+; CHECK-FORCED-NEXT:    br i1 [[DOTNOT]], label [[DOTPREHEADER:%.*]], label [[DOTPREHEADER35_LR_PH:%.*]]
+; CHECK-FORCED:       .preheader35.lr.ph:
+; CHECK-FORCED-NEXT:    [[TMP21:%.*]] = sub i64 0, [[TMP7]]
+; CHECK-FORCED-NEXT:    br label [[DOTPREHEADER35:%.*]]
+; CHECK-FORCED:       .preheader35:
+; CHECK-FORCED-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[SE...
[truncated]

Copy link

github-actions bot commented May 17, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@sjoerdmeijer sjoerdmeijer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually these are canonicalized in instcombine to remove the not and invert the true/false values, but this will not happen for Loginal operations

Just wanted to quickly check the reason instcombine isn't doing canonicalisations for these cases. I guess it's because the limited scope that instcombine has, so it won't trigger, is that right?

llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved
llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved
llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved
@davemgreen
Copy link
Collaborator Author

Thanks

Just wanted to quickly check the reason instcombine isn't doing canonicalisations for these cases. I guess it's because the limited scope that instcombine has, so it won't trigger, is that right?

It will be because select x, true, y is treated as or x, y (but poison blocking), and it does not want to break that canonical form.

Copy link
Collaborator

@sjoerdmeijer sjoerdmeijer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cheers, LGTM

Copy link
Contributor

@sapostolakis sapostolakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I only had two minor comments.

@@ -153,14 +157,22 @@ class SelectOptimizeImpl {
bool isValid() { return I; }
operator bool() { return isValid(); }

/// Invert the select by inverting the condition and switching the operands.
void setInverted() {
assert(!Inverted && "Trying to invert and inverted SelectLike");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invert and inverted --> invert an inverted

llvm/lib/CodeGen/SelectOptimize.cpp Show resolved Hide resolved
This patch attempts to help the SelectOpt pass detect select groups made up of
conditions and not(conditions). Usually these are canonicalized in instcombine
to remove the not and invert the true/false values, but this will not happen
for Loginal operations, which can be beneficial to convert if they are part of
a larger select group. The handling for not's are mostly handled in the
SelectLike, which can be marked as Inverted in order to reverse the TrueValue
and FalseValue.

This helps fix a regression in fortran minloc constructs, after llvm#84628 helped
simplify a loop with branches into a loop with selects.
@davemgreen
Copy link
Collaborator Author

Thanks

@davemgreen davemgreen merged commit 0a62a99 into llvm:main May 22, 2024
3 of 4 checks passed
@davemgreen davemgreen deleted the gh-selectopt-notcond branch May 22, 2024 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants