[SelectOpt] Add handling for not conditions. #92517

davemgreen · 2024-05-17T10:40:02Z

This patch attempts to help the SelectOpt pass detect select groups made up of conditions and not(conditions). Usually these are canonicalized in instcombine to remove the not and invert the true/false values, but this will not happen for Loginal operations, which can be beneficial to convert if they are part of a larger select group. The handling for not's are mostly handled in the SelectLike, which can be marked as Inverted in order to reverse the TrueValue and FalseValue.

This helps fix a regression in fortran minloc constructs, after #84628 helped simplify a loop with branches into a loop with selects.

llvmbot · 2024-05-17T10:40:33Z

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

This patch attempts to help the SelectOpt pass detect select groups made up of conditions and not(conditions). Usually these are canonicalized in instcombine to remove the not and invert the true/false values, but this will not happen for Loginal operations, which can be beneficial to convert if they are part of a larger select group. The handling for not's are mostly handled in the SelectLike, which can be marked as Inverted in order to reverse the TrueValue and FalseValue.

This helps fix a regression in fortran minloc constructs, after #84628 helped simplify a loop with branches into a loop with selects.

Patch is 42.36 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92517.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/SelectOptimize.cpp (+59-16)
(modified) llvm/test/CodeGen/AArch64/selectopt-not.ll (+220-106)

diff --git a/llvm/lib/CodeGen/SelectOptimize.cpp b/llvm/lib/CodeGen/SelectOptimize.cpp
index 2e03ae6aec94d..30befb1a537f3 100644
--- a/llvm/lib/CodeGen/SelectOptimize.cpp
+++ b/llvm/lib/CodeGen/SelectOptimize.cpp
@@ -130,7 +130,11 @@ class SelectOptimizeImpl {
   class SelectLike {
     SelectLike(Instruction *I) : I(I) {}
 
+    /// The select (/or) instruction.
     Instruction *I;
+    /// Whether this select is inverted, "not(cond), FalseVal, TrueVal", as
+    /// opposed to the original condition.
+    bool Inverted = false;
 
   public:
     /// Match a select or select-like instruction, returning a SelectLike.
@@ -153,14 +157,21 @@ class SelectOptimizeImpl {
     bool isValid() { return I; }
     operator bool() { return isValid(); }
 
+    /// Invert the select by inverting the condition and switching the operands.
+    void setInverted() {
+      assert(!Inverted && "Trying to invert and inverted SelectLike");
+      assert(isa<Instruction>(getCondition()) &&
+             cast<Instruction>(getCondition())->getOpcode() == Instruction::Xor);
+      Inverted = true;
+    }
+    bool isInverted() const { return Inverted; }
+
     Instruction *getI() { return I; }
     const Instruction *getI() const { return I; }
 
     Type *getType() const { return I->getType(); }
 
-    /// Return the condition for the SelectLike instruction. For example the
-    /// condition of a select or c in `or(zext(c), x)`
-    Value *getCondition() const {
+    Value *getNonInvertedCondition() const {
       if (auto *Sel = dyn_cast<SelectInst>(I))
         return Sel->getCondition();
       // Or(zext) case
@@ -177,11 +188,22 @@ class SelectOptimizeImpl {
       llvm_unreachable("Unhandled case in getCondition");
     }
 
+    /// Return the condition for the SelectLike instruction. For example the
+    /// condition of a select or c in `or(zext(c), x)`
+    Value *getCondition() const {
+      Value *CC = getNonInvertedCondition();
+      if (Inverted)
+        return cast<Instruction>(CC)->getOperand(0);
+      return CC;
+    }
+
     /// Return the true value for the SelectLike instruction. Note this may not
     /// exist for all SelectLike instructions. For example, for `or(zext(c), x)`
     /// the true value would be `or(x,1)`. As this value does not exist, nullptr
     /// is returned.
-    Value *getTrueValue() const {
+    Value *getTrueValue(bool HonorInverts = true) const {
+      if (Inverted && HonorInverts)
+        return getFalseValue(false);
       if (auto *Sel = dyn_cast<SelectInst>(I))
         return Sel->getTrueValue();
       // Or(zext) case - The true value is Or(X), so return nullptr as the value
@@ -195,7 +217,9 @@ class SelectOptimizeImpl {
     /// Return the false value for the SelectLike instruction. For example the
     /// getFalseValue of a select or `x` in `or(zext(c), x)` (which is
     /// `select(c, x|1, x)`)
-    Value *getFalseValue() const {
+    Value *getFalseValue(bool HonorInverts = true) const {
+      if (Inverted && HonorInverts)
+        return getTrueValue(false);
       if (auto *Sel = dyn_cast<SelectInst>(I))
         return Sel->getFalseValue();
       // Or(zext) case - return the operand which is not the zext.
@@ -216,8 +240,8 @@ class SelectOptimizeImpl {
     /// InstCostMap. This may need to be generated for select-like instructions.
     Scaled64 getTrueOpCost(DenseMap<const Instruction *, CostInfo> &InstCostMap,
                            const TargetTransformInfo *TTI) {
-      if (auto *Sel = dyn_cast<SelectInst>(I))
-        if (auto *I = dyn_cast<Instruction>(Sel->getTrueValue()))
+      if (isa<SelectInst>(I))
+        if (auto *I = dyn_cast<Instruction>(getTrueValue()))
           return InstCostMap.contains(I) ? InstCostMap[I].NonPredCost
                                          : Scaled64::getZero();
 
@@ -242,8 +266,8 @@ class SelectOptimizeImpl {
     Scaled64
     getFalseOpCost(DenseMap<const Instruction *, CostInfo> &InstCostMap,
                    const TargetTransformInfo *TTI) {
-      if (auto *Sel = dyn_cast<SelectInst>(I))
-        if (auto *I = dyn_cast<Instruction>(Sel->getFalseValue()))
+      if (isa<SelectInst>(I))
+        if (auto *I = dyn_cast<Instruction>(getFalseValue()))
           return InstCostMap.contains(I) ? InstCostMap[I].NonPredCost
                                          : Scaled64::getZero();
 
@@ -510,9 +534,10 @@ getTrueOrFalseValue(SelectOptimizeImpl::SelectLike SI, bool isTrue,
   for (SelectInst *DefSI = dyn_cast<SelectInst>(SI.getI());
        DefSI != nullptr && Selects.count(DefSI);
        DefSI = dyn_cast<SelectInst>(V)) {
-    assert(DefSI->getCondition() == SI.getCondition() &&
-           "The condition of DefSI does not match with SI");
-    V = (isTrue ? DefSI->getTrueValue() : DefSI->getFalseValue());
+    if (DefSI->getCondition() == SI.getCondition())
+      V = (isTrue ? DefSI->getTrueValue() : DefSI->getFalseValue());
+    else // Handle inverted SI
+      V = (!isTrue ? DefSI->getTrueValue() : DefSI->getFalseValue());
   }
 
   if (isa<BinaryOperator>(SI.getI())) {
@@ -634,16 +659,17 @@ void SelectOptimizeImpl::convertProfitableSIGroups(SelectGroups &ProfSIGroups) {
 
     // Move any debug/pseudo instructions that were in-between the select
     // group to the newly-created end block.
-    SmallVector<Instruction *, 2> DebugPseudoINS;
+    SmallVector<Instruction *, 2> SinkInstrs;
     auto DIt = SI.getI()->getIterator();
     while (&*DIt != LastSI.getI()) {
       if (DIt->isDebugOrPseudoInst())
-        DebugPseudoINS.push_back(&*DIt);
+        SinkInstrs.push_back(&*DIt);
+      if (match(&*DIt, m_Not(m_Specific(SI.getCondition()))))
+        SinkInstrs.push_back(&*DIt);
       DIt++;
     }
-    for (auto *DI : DebugPseudoINS) {
+    for (auto *DI : SinkInstrs)
       DI->moveBeforePreserving(&*EndBlock->getFirstInsertionPt());
-    }
 
     // Duplicate implementation for DbgRecords, the non-instruction debug-info
     // format. Helper lambda for moving DbgRecords to the end block.
@@ -765,6 +791,13 @@ void SelectOptimizeImpl::collectSelectGroups(BasicBlock &BB,
           ++BBIt;
           continue;
         }
+
+        // Skip not(select, if the not is part of the same select group
+        if (match(NI, m_Not(m_Specific(SI.getCondition())))) {
+          ++BBIt;
+          continue;
+        }
+
         // We only allow selects in the same group, not other select-like
         // instructions.
         if (!isa<SelectInst>(NI))
@@ -773,6 +806,10 @@ void SelectOptimizeImpl::collectSelectGroups(BasicBlock &BB,
         SelectLike NSI = SelectLike::match(NI);
         if (NSI && SI.getCondition() == NSI.getCondition()) {
           SIGroup.push_back(NSI);
+        } else if (NSI && match(NSI.getCondition(),
+                                m_Not(m_Specific(SI.getCondition())))) {
+          NSI.setInverted();
+          SIGroup.push_back(NSI);
         } else
           break;
         ++BBIt;
@@ -783,6 +820,12 @@ void SelectOptimizeImpl::collectSelectGroups(BasicBlock &BB,
       if (!isSelectKindSupported(SI))
         continue;
 
+      LLVM_DEBUG({
+        dbgs() << "New Select group with\n";
+        for (auto SI : SIGroup)
+          dbgs() << "  " << *SI.getI() << "\n";
+      });
+
       SIGroups.push_back(SIGroup);
     }
   }
diff --git a/llvm/test/CodeGen/AArch64/selectopt-not.ll b/llvm/test/CodeGen/AArch64/selectopt-not.ll
index 7a949d11c80d0..a7939d651a2c6 100644
--- a/llvm/test/CodeGen/AArch64/selectopt-not.ll
+++ b/llvm/test/CodeGen/AArch64/selectopt-not.ll
@@ -1,5 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -select-optimize -mtriple=aarch64-linux-gnu -mcpu=neoverse-v2 -S < %s | FileCheck %s
+; RUN: opt -select-optimize -mtriple=aarch64-linux-gnu -mcpu=neoverse-v2 -S < %s | FileCheck %s --check-prefixes=CHECK,CHECK-STANDARD
+; RUN: opt -select-optimize -mtriple=aarch64-linux-gnu -mcpu=neoverse-v2 -S -disable-loop-level-heuristics < %s | FileCheck %s --check-prefixes=CHECK,CHECK-FORCED
 
 target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
 target triple = "aarch64"
@@ -29,10 +30,10 @@ define i32 @minloc1(ptr nocapture readonly %0, ptr nocapture readonly %1, ptr no
 ; CHECK-NEXT:    [[TMP21:%.*]] = sub i64 0, [[TMP7]]
 ; CHECK-NEXT:    br label [[DOTPREHEADER35:%.*]]
 ; CHECK:       .preheader35:
-; CHECK-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[TMP23:%.*]] = phi i64 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[IV_N:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[DOT045:%.*]] = phi i1 [ false, [[DOTPREHEADER35_LR_PH]] ], [ [[DOT2:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[DOTLCSSA364144:%.*]] = phi i32 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP29:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[SELECT_END:%.*]] ]
+; CHECK-NEXT:    [[TMP23:%.*]] = phi i64 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[IV_N:%.*]], [[SELECT_END]] ]
+; CHECK-NEXT:    [[DOT045:%.*]] = phi i1 [ false, [[DOTPREHEADER35_LR_PH]] ], [ [[DOT2:%.*]], [[SELECT_END]] ]
+; CHECK-NEXT:    [[DOTLCSSA364144:%.*]] = phi i32 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP29:%.*]], [[SELECT_END]] ]
 ; CHECK-NEXT:    [[TMP24:%.*]] = mul nsw i64 [[TMP23]], [[TMP11]]
 ; CHECK-NEXT:    [[TMP25:%.*]] = getelementptr i8, ptr [[TMP19]], i64 [[TMP24]]
 ; CHECK-NEXT:    [[TMP26:%.*]] = load i32, ptr [[TMP25]], align 4
@@ -40,15 +41,20 @@ define i32 @minloc1(ptr nocapture readonly %0, ptr nocapture readonly %1, ptr no
 ; CHECK-NEXT:    [[TMP28:%.*]] = icmp sge i32 [[TMP26]], [[TMP22]]
 ; CHECK-NEXT:    [[DOTNOT33:%.*]] = and i1 [[DOT045]], [[TMP28]]
 ; CHECK-NEXT:    [[OR_COND:%.*]] = select i1 [[TMP27]], i1 true, i1 [[DOTNOT33]]
-; CHECK-NEXT:    [[TMP29]] = select i1 [[OR_COND]], i32 [[DOTLCSSA364144]], i32 1
+; CHECK-NEXT:    [[OR_COND_FROZEN:%.*]] = freeze i1 [[OR_COND]]
+; CHECK-NEXT:    br i1 [[OR_COND_FROZEN]], label [[SELECT_END]], label [[SELECT_FALSE:%.*]]
+; CHECK:       select.false:
+; CHECK-NEXT:    br label [[SELECT_END]]
+; CHECK:       select.end:
+; CHECK-NEXT:    [[TMP29]] = phi i32 [ [[DOTLCSSA364144]], [[DOTPREHEADER35]] ], [ 1, [[SELECT_FALSE]] ]
+; CHECK-NEXT:    [[DOT2]] = phi i1 [ [[DOT045]], [[DOTPREHEADER35]] ], [ true, [[SELECT_FALSE]] ]
+; CHECK-NEXT:    [[TMP30]] = phi i32 [ [[TMP22]], [[DOTPREHEADER35]] ], [ [[TMP20]], [[SELECT_FALSE]] ]
 ; CHECK-NEXT:    [[NOT_OR_COND:%.*]] = xor i1 [[OR_COND]], true
-; CHECK-NEXT:    [[DOT2]] = select i1 [[NOT_OR_COND]], i1 true, i1 [[DOT045]]
-; CHECK-NEXT:    [[TMP30]] = select i1 [[OR_COND]], i32 [[TMP22]], i32 [[TMP20]]
 ; CHECK-NEXT:    [[IV_N]] = add nuw nsw i64 [[TMP23]], 1
 ; CHECK-NEXT:    [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_N]], [[TMP9]]
 ; CHECK-NEXT:    br i1 [[EXITCOND_NOT]], label [[DOTPREHEADER]], label [[DOTPREHEADER35]]
 ; CHECK:       .preheader:
-; CHECK-NEXT:    [[DOTLCSSA3641_LCSSA:%.*]] = phi i32 [ 0, [[TMP3:%.*]] ], [ [[TMP29]], [[DOTPREHEADER35]] ]
+; CHECK-NEXT:    [[DOTLCSSA3641_LCSSA:%.*]] = phi i32 [ 0, [[TMP3:%.*]] ], [ [[TMP29]], [[SELECT_END]] ]
 ; CHECK-NEXT:    ret i32 [[DOTLCSSA3641_LCSSA]]
 ;
   %4 = getelementptr i8, ptr %0, i64 40
@@ -101,53 +107,106 @@ define i32 @minloc1(ptr nocapture readonly %0, ptr nocapture readonly %1, ptr no
 }
 
 define i32 @minloc1_otherunusednot(ptr nocapture readonly %0, ptr nocapture readonly %1, ptr nocapture readonly %2) {
-; CHECK-LABEL: @minloc1_otherunusednot(
-; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr i8, ptr [[TMP0:%.*]], i64 40
-; CHECK-NEXT:    [[TMP5:%.*]] = load i64, ptr [[TMP4]], align 8
-; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr [[TMP0]], i64 64
-; CHECK-NEXT:    [[TMP7:%.*]] = load i64, ptr [[TMP6]], align 8
-; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr i8, ptr [[TMP0]], i64 80
-; CHECK-NEXT:    [[TMP9:%.*]] = load i64, ptr [[TMP8]], align 8
-; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr [[TMP0]], i64 88
-; CHECK-NEXT:    [[TMP11:%.*]] = load i64, ptr [[TMP10]], align 8
-; CHECK-NEXT:    [[TMP12:%.*]] = load ptr, ptr [[TMP0]], align 8
-; CHECK-NEXT:    [[TMP13:%.*]] = load i32, ptr [[TMP1:%.*]], align 4
-; CHECK-NEXT:    [[TMP14:%.*]] = sext i32 [[TMP13]] to i64
-; CHECK-NEXT:    [[TMP15:%.*]] = add nsw i64 [[TMP14]], -1
-; CHECK-NEXT:    [[TMP16:%.*]] = mul i64 [[TMP15]], [[TMP5]]
-; CHECK-NEXT:    [[TMP17:%.*]] = getelementptr i8, ptr [[TMP12]], i64 [[TMP16]]
-; CHECK-NEXT:    [[TMP18:%.*]] = shl i64 [[TMP7]], 3
-; CHECK-NEXT:    [[TMP19:%.*]] = getelementptr i8, ptr [[TMP17]], i64 [[TMP18]]
-; CHECK-NEXT:    [[TMP20:%.*]] = load i32, ptr [[TMP2:%.*]], align 4
-; CHECK-NEXT:    [[DOTNOT:%.*]] = icmp slt i64 [[TMP9]], 1
-; CHECK-NEXT:    br i1 [[DOTNOT]], label [[DOTPREHEADER:%.*]], label [[DOTPREHEADER35_LR_PH:%.*]]
-; CHECK:       .preheader35.lr.ph:
-; CHECK-NEXT:    [[TMP21:%.*]] = sub i64 0, [[TMP7]]
-; CHECK-NEXT:    br label [[DOTPREHEADER35:%.*]]
-; CHECK:       .preheader35:
-; CHECK-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[TMP23:%.*]] = phi i64 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[IV_N:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[DOT045:%.*]] = phi i1 [ false, [[DOTPREHEADER35_LR_PH]] ], [ [[DOT2:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[DOTLCSSA364144:%.*]] = phi i32 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP29:%.*]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[TMP24:%.*]] = mul nsw i64 [[TMP23]], [[TMP11]]
-; CHECK-NEXT:    [[TMP25:%.*]] = getelementptr i8, ptr [[TMP19]], i64 [[TMP24]]
-; CHECK-NEXT:    [[TMP26:%.*]] = load i32, ptr [[TMP25]], align 4
-; CHECK-NEXT:    [[TMP27:%.*]] = icmp ne i32 [[TMP26]], [[TMP20]]
-; CHECK-NEXT:    [[TMP28:%.*]] = icmp sge i32 [[TMP26]], [[TMP22]]
-; CHECK-NEXT:    [[DOTNOT33:%.*]] = and i1 [[DOT045]], [[TMP28]]
-; CHECK-NEXT:    [[OR_COND:%.*]] = select i1 [[TMP27]], i1 true, i1 [[DOTNOT33]]
-; CHECK-NEXT:    [[TMP29]] = select i1 [[OR_COND]], i32 [[DOTLCSSA364144]], i32 1
-; CHECK-NEXT:    [[DOT2]] = select i1 [[OR_COND]], i1 [[DOT045]], i1 true
-; CHECK-NEXT:    [[NOT_OR_COND:%.*]] = xor i1 [[OR_COND]], true
-; CHECK-NEXT:    [[TMP30]] = select i1 [[OR_COND]], i32 [[TMP22]], i32 [[TMP20]]
-; CHECK-NEXT:    [[IV_N]] = add nuw nsw i64 [[TMP23]], 1
-; CHECK-NEXT:    [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_N]], [[TMP9]]
-; CHECK-NEXT:    br i1 [[EXITCOND_NOT]], label [[DOTPREHEADER]], label [[DOTPREHEADER35]]
-; CHECK:       .preheader:
-; CHECK-NEXT:    [[DOTLCSSA3641_LCSSA:%.*]] = phi i32 [ 0, [[TMP3:%.*]] ], [ [[TMP29]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[P:%.*]] = phi i1 [ false, [[TMP3]] ], [ [[NOT_OR_COND]], [[DOTPREHEADER35]] ]
-; CHECK-NEXT:    [[Q:%.*]] = select i1 [[P]], i32 [[DOTLCSSA3641_LCSSA]], i32 1
-; CHECK-NEXT:    ret i32 [[Q]]
+; CHECK-STANDARD-LABEL: @minloc1_otherunusednot(
+; CHECK-STANDARD-NEXT:    [[TMP4:%.*]] = getelementptr i8, ptr [[TMP0:%.*]], i64 40
+; CHECK-STANDARD-NEXT:    [[TMP5:%.*]] = load i64, ptr [[TMP4]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr [[TMP0]], i64 64
+; CHECK-STANDARD-NEXT:    [[TMP7:%.*]] = load i64, ptr [[TMP6]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP8:%.*]] = getelementptr i8, ptr [[TMP0]], i64 80
+; CHECK-STANDARD-NEXT:    [[TMP9:%.*]] = load i64, ptr [[TMP8]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr [[TMP0]], i64 88
+; CHECK-STANDARD-NEXT:    [[TMP11:%.*]] = load i64, ptr [[TMP10]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP12:%.*]] = load ptr, ptr [[TMP0]], align 8
+; CHECK-STANDARD-NEXT:    [[TMP13:%.*]] = load i32, ptr [[TMP1:%.*]], align 4
+; CHECK-STANDARD-NEXT:    [[TMP14:%.*]] = sext i32 [[TMP13]] to i64
+; CHECK-STANDARD-NEXT:    [[TMP15:%.*]] = add nsw i64 [[TMP14]], -1
+; CHECK-STANDARD-NEXT:    [[TMP16:%.*]] = mul i64 [[TMP15]], [[TMP5]]
+; CHECK-STANDARD-NEXT:    [[TMP17:%.*]] = getelementptr i8, ptr [[TMP12]], i64 [[TMP16]]
+; CHECK-STANDARD-NEXT:    [[TMP18:%.*]] = shl i64 [[TMP7]], 3
+; CHECK-STANDARD-NEXT:    [[TMP19:%.*]] = getelementptr i8, ptr [[TMP17]], i64 [[TMP18]]
+; CHECK-STANDARD-NEXT:    [[TMP20:%.*]] = load i32, ptr [[TMP2:%.*]], align 4
+; CHECK-STANDARD-NEXT:    [[DOTNOT:%.*]] = icmp slt i64 [[TMP9]], 1
+; CHECK-STANDARD-NEXT:    br i1 [[DOTNOT]], label [[DOTPREHEADER:%.*]], label [[DOTPREHEADER35_LR_PH:%.*]]
+; CHECK-STANDARD:       .preheader35.lr.ph:
+; CHECK-STANDARD-NEXT:    [[TMP21:%.*]] = sub i64 0, [[TMP7]]
+; CHECK-STANDARD-NEXT:    br label [[DOTPREHEADER35:%.*]]
+; CHECK-STANDARD:       .preheader35:
+; CHECK-STANDARD-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[TMP23:%.*]] = phi i64 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[IV_N:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[DOT045:%.*]] = phi i1 [ false, [[DOTPREHEADER35_LR_PH]] ], [ [[DOT2:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[DOTLCSSA364144:%.*]] = phi i32 [ 0, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP29:%.*]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[TMP24:%.*]] = mul nsw i64 [[TMP23]], [[TMP11]]
+; CHECK-STANDARD-NEXT:    [[TMP25:%.*]] = getelementptr i8, ptr [[TMP19]], i64 [[TMP24]]
+; CHECK-STANDARD-NEXT:    [[TMP26:%.*]] = load i32, ptr [[TMP25]], align 4
+; CHECK-STANDARD-NEXT:    [[TMP27:%.*]] = icmp ne i32 [[TMP26]], [[TMP20]]
+; CHECK-STANDARD-NEXT:    [[TMP28:%.*]] = icmp sge i32 [[TMP26]], [[TMP22]]
+; CHECK-STANDARD-NEXT:    [[DOTNOT33:%.*]] = and i1 [[DOT045]], [[TMP28]]
+; CHECK-STANDARD-NEXT:    [[OR_COND:%.*]] = select i1 [[TMP27]], i1 true, i1 [[DOTNOT33]]
+; CHECK-STANDARD-NEXT:    [[TMP29]] = select i1 [[OR_COND]], i32 [[DOTLCSSA364144]], i32 1
+; CHECK-STANDARD-NEXT:    [[DOT2]] = select i1 [[OR_COND]], i1 [[DOT045]], i1 true
+; CHECK-STANDARD-NEXT:    [[NOT_OR_COND:%.*]] = xor i1 [[OR_COND]], true
+; CHECK-STANDARD-NEXT:    [[TMP30]] = select i1 [[OR_COND]], i32 [[TMP22]], i32 [[TMP20]]
+; CHECK-STANDARD-NEXT:    [[IV_N]] = add nuw nsw i64 [[TMP23]], 1
+; CHECK-STANDARD-NEXT:    [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_N]], [[TMP9]]
+; CHECK-STANDARD-NEXT:    br i1 [[EXITCOND_NOT]], label [[DOTPREHEADER]], label [[DOTPREHEADER35]]
+; CHECK-STANDARD:       .preheader:
+; CHECK-STANDARD-NEXT:    [[DOTLCSSA3641_LCSSA:%.*]] = phi i32 [ 0, [[TMP3:%.*]] ], [ [[TMP29]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[P:%.*]] = phi i1 [ false, [[TMP3]] ], [ [[NOT_OR_COND]], [[DOTPREHEADER35]] ]
+; CHECK-STANDARD-NEXT:    [[Q:%.*]] = select i1 [[P]], i32 [[DOTLCSSA3641_LCSSA]], i32 1
+; CHECK-STANDARD-NEXT:    ret i32 [[Q]]
+;
+; CHECK-FORCED-LABEL: @minloc1_otherunusednot(
+; CHECK-FORCED-NEXT:    [[TMP4:%.*]] = getelementptr i8, ptr [[TMP0:%.*]], i64 40
+; CHECK-FORCED-NEXT:    [[TMP5:%.*]] = load i64, ptr [[TMP4]], align 8
+; CHECK-FORCED-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr [[TMP0]], i64 64
+; CHECK-FORCED-NEXT:    [[TMP7:%.*]] = load i64, ptr [[TMP6]], align 8
+; CHECK-FORCED-NEXT:    [[TMP8:%.*]] = getelementptr i8, ptr [[TMP0]], i64 80
+; CHECK-FORCED-NEXT:    [[TMP9:%.*]] = load i64, ptr [[TMP8]], align 8
+; CHECK-FORCED-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr [[TMP0]], i64 88
+; CHECK-FORCED-NEXT:    [[TMP11:%.*]] = load i64, ptr [[TMP10]], align 8
+; CHECK-FORCED-NEXT:    [[TMP12:%.*]] = load ptr, ptr [[TMP0]], align 8
+; CHECK-FORCED-NEXT:    [[TMP13:%.*]] = load i32, ptr [[TMP1:%.*]], align 4
+; CHECK-FORCED-NEXT:    [[TMP14:%.*]] = sext i32 [[TMP13]] to i64
+; CHECK-FORCED-NEXT:    [[TMP15:%.*]] = add nsw i64 [[TMP14]], -1
+; CHECK-FORCED-NEXT:    [[TMP16:%.*]] = mul i64 [[TMP15]], [[TMP5]]
+; CHECK-FORCED-NEXT:    [[TMP17:%.*]] = getelementptr i8, ptr [[TMP12]], i64 [[TMP16]]
+; CHECK-FORCED-NEXT:    [[TMP18:%.*]] = shl i64 [[TMP7]], 3
+; CHECK-FORCED-NEXT:    [[TMP19:%.*]] = getelementptr i8, ptr [[TMP17]], i64 [[TMP18]]
+; CHECK-FORCED-NEXT:    [[TMP20:%.*]] = load i32, ptr [[TMP2:%.*]], align 4
+; CHECK-FORCED-NEXT:    [[DOTNOT:%.*]] = icmp slt i64 [[TMP9]], 1
+; CHECK-FORCED-NEXT:    br i1 [[DOTNOT]], label [[DOTPREHEADER:%.*]], label [[DOTPREHEADER35_LR_PH:%.*]]
+; CHECK-FORCED:       .preheader35.lr.ph:
+; CHECK-FORCED-NEXT:    [[TMP21:%.*]] = sub i64 0, [[TMP7]]
+; CHECK-FORCED-NEXT:    br label [[DOTPREHEADER35:%.*]]
+; CHECK-FORCED:       .preheader35:
+; CHECK-FORCED-NEXT:    [[TMP22:%.*]] = phi i32 [ 2147483647, [[DOTPREHEADER35_LR_PH]] ], [ [[TMP30:%.*]], [[SE...
[truncated]

github-actions · 2024-05-17T10:42:47Z

✅ With the latest revision this PR passed the C/C++ code formatter.

sjoerdmeijer

Usually these are canonicalized in instcombine to remove the not and invert the true/false values, but this will not happen for Loginal operations

Just wanted to quickly check the reason instcombine isn't doing canonicalisations for these cases. I guess it's because the limited scope that instcombine has, so it won't trigger, is that right?

llvm/lib/CodeGen/SelectOptimize.cpp

davemgreen · 2024-05-20T13:33:55Z

Thanks

Just wanted to quickly check the reason instcombine isn't doing canonicalisations for these cases. I guess it's because the limited scope that instcombine has, so it won't trigger, is that right?

It will be because select x, true, y is treated as or x, y (but poison blocking), and it does not want to break that canonical form.

sjoerdmeijer

Cheers, LGTM

sapostolakis

LGTM. I only had two minor comments.

sapostolakis · 2024-05-22T13:07:38Z

llvm/lib/CodeGen/SelectOptimize.cpp

@@ -153,14 +157,22 @@ class SelectOptimizeImpl {
    bool isValid() { return I; }
    operator bool() { return isValid(); }

+    /// Invert the select by inverting the condition and switching the operands.
+    void setInverted() {
+      assert(!Inverted && "Trying to invert and inverted SelectLike");


invert and inverted --> invert an inverted

llvm/lib/CodeGen/SelectOptimize.cpp

This patch attempts to help the SelectOpt pass detect select groups made up of conditions and not(conditions). Usually these are canonicalized in instcombine to remove the not and invert the true/false values, but this will not happen for Loginal operations, which can be beneficial to convert if they are part of a larger select group. The handling for not's are mostly handled in the SelectLike, which can be marked as Inverted in order to reverse the TrueValue and FalseValue. This helps fix a regression in fortran minloc constructs, after llvm#84628 helped simplify a loop with branches into a loop with selects.

davemgreen · 2024-05-22T18:28:08Z

Thanks

davemgreen requested review from sapostolakis, sjoerdmeijer and david-arm May 17, 2024 10:40

llvmbot added the backend:AArch64 label May 17, 2024

davemgreen force-pushed the gh-selectopt-notcond branch from 737719f to dc2e4cd Compare May 17, 2024 10:43

sjoerdmeijer reviewed May 20, 2024

View reviewed changes

llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved

llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved

llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved

davemgreen force-pushed the gh-selectopt-notcond branch from dc2e4cd to ed07b6c Compare May 20, 2024 13:34

sjoerdmeijer approved these changes May 22, 2024

View reviewed changes

sapostolakis approved these changes May 22, 2024

View reviewed changes

davemgreen added 2 commits May 22, 2024 19:03

Address comments

10d4675

davemgreen force-pushed the gh-selectopt-notcond branch from ed07b6c to 0b7e99d Compare May 22, 2024 18:05

Address comments

5c10ac7

davemgreen force-pushed the gh-selectopt-notcond branch from 0b7e99d to 5c10ac7 Compare May 22, 2024 18:25

davemgreen merged commit 0a62a99 into llvm:main May 22, 2024
3 of 4 checks passed

davemgreen deleted the gh-selectopt-notcond branch May 22, 2024 18:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SelectOpt] Add handling for not conditions. #92517

[SelectOpt] Add handling for not conditions. #92517

davemgreen commented May 17, 2024

llvmbot commented May 17, 2024

github-actions bot commented May 17, 2024 •

edited

Loading

sjoerdmeijer left a comment

davemgreen commented May 20, 2024

sjoerdmeijer left a comment

sapostolakis left a comment

sapostolakis May 22, 2024

davemgreen commented May 22, 2024

[SelectOpt] Add handling for not conditions. #92517

[SelectOpt] Add handling for not conditions. #92517

Conversation

davemgreen commented May 17, 2024

llvmbot commented May 17, 2024

github-actions bot commented May 17, 2024 • edited Loading

sjoerdmeijer left a comment

Choose a reason for hiding this comment

davemgreen commented May 20, 2024

sjoerdmeijer left a comment

Choose a reason for hiding this comment

sapostolakis left a comment

Choose a reason for hiding this comment

sapostolakis May 22, 2024

Choose a reason for hiding this comment

davemgreen commented May 22, 2024

github-actions bot commented May 17, 2024 •

edited

Loading