[LV][EVL] Support in-loop reduction using tail folding with EVL. #90184
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-llvm-transforms
Author: Mel Chen (Mel-Chen)
Changes
Following from #87816, add VPReductionEVLRecipe to describe vector predication reduction. Address one of TODOs from #76172.
Patch is 148.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90184.diff
12 Files Affected:
diff --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h
index b6534a1962a2f5..4db1fe5ff93aef 100644
--- a/llvm/include/llvm/IR/IRBuilder.h
+++ b/llvm/include/llvm/IR/IRBuilder.h
@@ -746,49 +746,68 @@ class IRBuilderBase {
private:
CallInst *getReductionIntrinsic(Intrinsic::ID ID, Value *Src);
+ // Helper function for creating VP reduce intrinsic call.
+ CallInst *getReductionIntrinsic(Intrinsic::ID ID, Value *Acc, Value *Src,
+ Value *Mask, Value *EVL);
+
public:
/// Create a sequential vector fadd reduction intrinsic of the source vector.
/// The first parameter is a scalar accumulator value. An unordered reduction
/// can be created by adding the reassoc fast-math flag to the resulting
/// sequential reduction.
CallInst *CreateFAddReduce(Value *Acc, Value *Src);
+ CallInst *CreateFAddReduce(Value *Acc, Value *Src, Value *EVL,
+ Value *Mask = nullptr);
/// Create a sequential vector fmul reduction intrinsic of the source vector.
/// The first parameter is a scalar accumulator value. An unordered reduction
/// can be created by adding the reassoc fast-math flag to the resulting
/// sequential reduction.
CallInst *CreateFMulReduce(Value *Acc, Value *Src);
+ CallInst *CreateFMulReduce(Value *Acc, Value *Src, Value *EVL,
+ Value *Mask = nullptr);
/// Create a vector int add reduction intrinsic of the source vector.
CallInst *CreateAddReduce(Value *Src);
+ CallInst *CreateAddReduce(Value *Src, Value *EVL, Value *Mask = nullptr);
/// Create a vector int mul reduction intrinsic of the source vector.
CallInst *CreateMulReduce(Value *Src);
+ CallInst *CreateMulReduce(Value *Src, Value *EVL, Value *Mask = nullptr);
/// Create a vector int AND reduction intrinsic of the source vector.
CallInst *CreateAndReduce(Value *Src);
+ CallInst *CreateAndReduce(Value *Src, Value *EVL, Value *Mask = nullptr);
/// Create a vector int OR reduction intrinsic of the source vector.
CallInst *CreateOrReduce(Value *Src);
+ CallInst *CreateOrReduce(Value *Src, Value *EVL, Value *Mask = nullptr);
/// Create a vector int XOR reduction intrinsic of the source vector.
CallInst *CreateXorReduce(Value *Src);
+ CallInst *CreateXorReduce(Value *Src, Value *EVL, Value *Mask = nullptr);
/// Create a vector integer max reduction intrinsic of the source
/// vector.
CallInst *CreateIntMaxReduce(Value *Src, bool IsSigned = false);
+ CallInst *CreateIntMaxReduce(Value *Src, Value *EVL, bool IsSigned = false,
+ Value *Mask = nullptr);
/// Create a vector integer min reduction intrinsic of the source
/// vector.
CallInst *CreateIntMinReduce(Value *Src, bool IsSigned = false);
+ CallInst *CreateIntMinReduce(Value *Src, Value *EVL, bool IsSigned = false,
+ Value *Mask = nullptr);
/// Create a vector float max reduction intrinsic of the source
/// vector.
CallInst *CreateFPMaxReduce(Value *Src);
+ CallInst *CreateFPMaxReduce(Value *Src, Value *EVL, Value *Mask = nullptr);
/// Create a vector float min reduction intrinsic of the source
/// vector.
CallInst *CreateFPMinReduce(Value *Src);
+ CallInst *CreateFPMinReduce(Value *Src, Value *EVL, Value *Mask = nullptr);
/// Create a vector float maximum reduction intrinsic of the source
/// vector. This variant follows the NaN and signed zero semantic of
diff --git a/llvm/include/llvm/Transforms/Utils/LoopUtils.h b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
index 187ace3a0cbedf..5003fa66100b46 100644
--- a/llvm/include/llvm/Transforms/Utils/LoopUtils.h
+++ b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
@@ -403,6 +403,9 @@ Value *getShuffleReduction(IRBuilderBase &Builder, Value *Src, unsigned Op,
/// Fast-math-flags are propagated using the IRBuilder's setting.
Value *createSimpleTargetReduction(IRBuilderBase &B, Value *Src,
RecurKind RdxKind);
+Value *createSimpleTargetReduction(IRBuilderBase &B, Value *Src,
+ RecurKind RdxKind, Value *EVL,
+ Value *Mask = nullptr);
/// Create a target reduction of the given vector \p Src for a reduction of the
/// kind RecurKind::IAnyOf or RecurKind::FAnyOf. The reduction operation is
@@ -423,6 +426,9 @@ Value *createTargetReduction(IRBuilderBase &B, const RecurrenceDescriptor &Desc,
Value *createOrderedReduction(IRBuilderBase &B,
const RecurrenceDescriptor &Desc, Value *Src,
Value *Start);
+Value *createOrderedReduction(IRBuilderBase &B,
+ const RecurrenceDescriptor &Desc, Value *Src,
+ Value *Start, Value *EVL, Value *Mask = nullptr);
/// Get the intersection (logical and) of all of the potential IR flags
/// of each scalar operation (VL) that will be converted into a vector (I).
diff --git a/llvm/lib/IR/IRBuilder.cpp b/llvm/lib/IR/IRBuilder.cpp
index d6746d1d438242..90f637940d00da 100644
--- a/llvm/lib/IR/IRBuilder.cpp
+++ b/llvm/lib/IR/IRBuilder.cpp
@@ -414,6 +414,20 @@ CallInst *IRBuilderBase::getReductionIntrinsic(Intrinsic::ID ID, Value *Src) {
return CreateCall(Decl, Ops);
}
+CallInst *IRBuilderBase::getReductionIntrinsic(Intrinsic::ID ID, Value *Acc,
+ Value *Src, Value *Mask,
+ Value *EVL) {
+ Module *M = GetInsertBlock()->getParent()->getParent();
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ EVL = CreateIntCast(EVL, getInt32Ty(), /*isSigned=*/false);
+ if (!Mask)
+ Mask = CreateVectorSplat(SrcTy->getElementCount(), getTrue());
+ Value *Ops[] = {Acc, Src, Mask, EVL};
+ Type *Tys[] = {SrcTy};
+ auto Decl = Intrinsic::getDeclaration(M, ID, Tys);
+ return CreateCall(Decl, Ops);
+}
+
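Reviewer's note: the helper above passes {Acc, Src, Mask, EVL} to a VP reduction intrinsic. As a rough scalar model of what such a call computes (a sketch only, assuming a lane participates when its index is below EVL and its mask bit is set; `vpReduceAddModel` is a hypothetical name, not an LLVM API):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of a masked, EVL-limited integer add reduction.
// Active lanes (index < EVL and mask bit set) are folded into Start;
// all other lanes are ignored. Unsigned arithmetic wraps, like integer add.
uint32_t vpReduceAddModel(uint32_t Start, const std::vector<uint32_t> &Src,
                          const std::vector<bool> &Mask, uint32_t EVL) {
  assert(Src.size() == Mask.size() && "mask and source must have equal length");
  uint32_t Acc = Start;
  for (std::size_t I = 0; I < Src.size() && I < EVL; ++I)
    if (Mask[I])
      Acc += Src[I];
  return Acc;
}
```

With EVL equal to zero the model returns Start unchanged, which is why the choice of start value matters for the overloads that do not take an accumulator.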
CallInst *IRBuilderBase::CreateFAddReduce(Value *Acc, Value *Src) {
Module *M = GetInsertBlock()->getParent()->getParent();
Value *Ops[] = {Acc, Src};
@@ -422,6 +436,11 @@ CallInst *IRBuilderBase::CreateFAddReduce(Value *Acc, Value *Src) {
return CreateCall(Decl, Ops);
}
+CallInst *IRBuilderBase::CreateFAddReduce(Value *Acc, Value *Src, Value *EVL,
+ Value *Mask) {
+ return getReductionIntrinsic(Intrinsic::vp_reduce_fadd, Acc, Src, Mask, EVL);
+}
+
CallInst *IRBuilderBase::CreateFMulReduce(Value *Acc, Value *Src) {
Module *M = GetInsertBlock()->getParent()->getParent();
Value *Ops[] = {Acc, Src};
@@ -430,46 +449,149 @@ CallInst *IRBuilderBase::CreateFMulReduce(Value *Acc, Value *Src) {
return CreateCall(Decl, Ops);
}
+CallInst *IRBuilderBase::CreateFMulReduce(Value *Acc, Value *Src, Value *EVL,
+ Value *Mask) {
+ return getReductionIntrinsic(Intrinsic::vp_reduce_fmul, Acc, Src, Mask, EVL);
+}
+
CallInst *IRBuilderBase::CreateAddReduce(Value *Src) {
return getReductionIntrinsic(Intrinsic::vector_reduce_add, Src);
}
+CallInst *IRBuilderBase::CreateAddReduce(Value *Src, Value *EVL, Value *Mask) {
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ auto *EltTy = SrcTy->getElementType();
+ return getReductionIntrinsic(Intrinsic::vp_reduce_add,
+ ConstantInt::get(EltTy, 0), Src, Mask, EVL);
+}
+
CallInst *IRBuilderBase::CreateMulReduce(Value *Src) {
return getReductionIntrinsic(Intrinsic::vector_reduce_mul, Src);
}
+CallInst *IRBuilderBase::CreateMulReduce(Value *Src, Value *EVL, Value *Mask) {
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ auto *EltTy = SrcTy->getElementType();
+ return getReductionIntrinsic(Intrinsic::vp_reduce_mul,
+ ConstantInt::get(EltTy, 1), Src, Mask, EVL);
+}
+
CallInst *IRBuilderBase::CreateAndReduce(Value *Src) {
return getReductionIntrinsic(Intrinsic::vector_reduce_and, Src);
}
+CallInst *IRBuilderBase::CreateAndReduce(Value *Src, Value *EVL, Value *Mask) {
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ auto *EltTy = SrcTy->getElementType();
+ return getReductionIntrinsic(Intrinsic::vp_reduce_and,
+ Constant::getAllOnesValue(EltTy), Src, Mask,
+ EVL);
+}
+
CallInst *IRBuilderBase::CreateOrReduce(Value *Src) {
return getReductionIntrinsic(Intrinsic::vector_reduce_or, Src);
}
+CallInst *IRBuilderBase::CreateOrReduce(Value *Src, Value *EVL, Value *Mask) {
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ auto *EltTy = SrcTy->getElementType();
+ return getReductionIntrinsic(Intrinsic::vp_reduce_or,
+ ConstantInt::get(EltTy, 0), Src, Mask, EVL);
+}
+
CallInst *IRBuilderBase::CreateXorReduce(Value *Src) {
return getReductionIntrinsic(Intrinsic::vector_reduce_xor, Src);
}
+CallInst *IRBuilderBase::CreateXorReduce(Value *Src, Value *EVL, Value *Mask) {
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ auto *EltTy = SrcTy->getElementType();
+ return getReductionIntrinsic(Intrinsic::vp_reduce_xor,
+ ConstantInt::get(EltTy, 0), Src, Mask, EVL);
+}
+
CallInst *IRBuilderBase::CreateIntMaxReduce(Value *Src, bool IsSigned) {
auto ID =
IsSigned ? Intrinsic::vector_reduce_smax : Intrinsic::vector_reduce_umax;
return getReductionIntrinsic(ID, Src);
}
+CallInst *IRBuilderBase::CreateIntMaxReduce(Value *Src, Value *EVL,
+ bool IsSigned, Value *Mask) {
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ auto *EltTy = SrcTy->getElementType();
+ return getReductionIntrinsic(
+ IsSigned ? Intrinsic::vp_reduce_smax : Intrinsic::vp_reduce_umax,
+ IsSigned ? ConstantInt::get(EltTy, APInt::getSignedMinValue(
+ EltTy->getIntegerBitWidth()))
+ : ConstantInt::get(EltTy, 0),
+ Src, Mask, EVL);
+}
+
CallInst *IRBuilderBase::CreateIntMinReduce(Value *Src, bool IsSigned) {
auto ID =
IsSigned ? Intrinsic::vector_reduce_smin : Intrinsic::vector_reduce_umin;
return getReductionIntrinsic(ID, Src);
}
+CallInst *IRBuilderBase::CreateIntMinReduce(Value *Src, Value *EVL,
+ bool IsSigned, Value *Mask) {
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ auto *EltTy = SrcTy->getElementType();
+ return getReductionIntrinsic(
+ IsSigned ? Intrinsic::vp_reduce_smin : Intrinsic::vp_reduce_umin,
+ IsSigned ? ConstantInt::get(EltTy, APInt::getSignedMaxValue(
+ EltTy->getIntegerBitWidth()))
+ : Constant::getAllOnesValue(EltTy),
+ Src, Mask, EVL);
+}
+
CallInst *IRBuilderBase::CreateFPMaxReduce(Value *Src) {
return getReductionIntrinsic(Intrinsic::vector_reduce_fmax, Src);
}
+CallInst *IRBuilderBase::CreateFPMaxReduce(Value *Src, Value *EVL,
+ Value *Mask) {
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ auto *EltTy = SrcTy->getElementType();
+ FastMathFlags FMF = getFastMathFlags();
+ Value *Neutral;
+ if (FMF.noNaNs())
+ Neutral = FMF.noInfs()
+ ? ConstantFP::get(
+ EltTy, APFloat::getLargest(EltTy->getFltSemantics(),
+ /*Negative=*/true))
+ : ConstantFP::getInfinity(EltTy, true);
+ else
+ Neutral = ConstantFP::getQNaN(EltTy, /*Negative=*/true);
+
+ return getReductionIntrinsic(Intrinsic::vp_reduce_fmax, Neutral, Src, Mask,
+ EVL);
+}
+
CallInst *IRBuilderBase::CreateFPMinReduce(Value *Src) {
return getReductionIntrinsic(Intrinsic::vector_reduce_fmin, Src);
}
+CallInst *IRBuilderBase::CreateFPMinReduce(Value *Src, Value *EVL,
+ Value *Mask) {
+ auto *SrcTy = cast<VectorType>(Src->getType());
+ auto *EltTy = SrcTy->getElementType();
+ FastMathFlags FMF = getFastMathFlags();
+ Value *Neutral;
+ if (FMF.noNaNs())
+ Neutral = FMF.noInfs()
+ ? ConstantFP::get(
+ EltTy, APFloat::getLargest(EltTy->getFltSemantics(),
+ /*Negative=*/false))
+ : ConstantFP::getInfinity(EltTy, false);
+ else
+ Neutral = ConstantFP::getQNaN(EltTy, /*Negative=*/false);
+
+ return getReductionIntrinsic(Intrinsic::vp_reduce_fmin, Neutral, Src, Mask,
+ EVL);
+}
+
CallInst *IRBuilderBase::CreateFPMaximumReduce(Value *Src) {
return getReductionIntrinsic(Intrinsic::vector_reduce_fmaximum, Src);
}
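Reviewer's note: the overloads above that take no accumulator supply a neutral start value (0 for add/or/xor/umax, 1 for mul, all-ones for and/umin, the signed minimum for smax, the signed maximum for smin, and a quiet NaN for fmax/fmin when NaNs are possible). A minimal sketch of why those values are neutral follows. All names are hypothetical, and the floating-point part assumes maxnum-like behaviour, which `std::fmax` shares: when exactly one operand is NaN, the other operand is returned.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Fold the first EVL lanes of Src into Start with a binary operation.
template <typename T, typename Op>
T foldActiveLanes(T Start, const std::vector<T> &Src, std::size_t EVL,
                  Op Combine) {
  T Acc = Start;
  for (std::size_t I = 0; I < Src.size() && I < EVL; ++I)
    Acc = Combine(Acc, Src[I]);
  return Acc;
}

// Start values mirroring the constants used in the diff, for 32-bit lanes.
constexpr int32_t SMaxStart = std::numeric_limits<int32_t>::min();
constexpr uint32_t UMinStart = std::numeric_limits<uint32_t>::max(); // all ones

int32_t smaxReduceModel(const std::vector<int32_t> &Src, std::size_t EVL) {
  return foldActiveLanes(SMaxStart, Src, EVL,
                         [](int32_t A, int32_t B) { return std::max(A, B); });
}

uint32_t uminReduceModel(const std::vector<uint32_t> &Src, std::size_t EVL) {
  return foldActiveLanes(UMinStart, Src, EVL,
                         [](uint32_t A, uint32_t B) { return std::min(A, B); });
}

// fmax without fast-math flags: a negative quiet NaN start is dropped as soon
// as the first non-NaN lane is combined with it.
double fmaxReduceModel(const std::vector<double> &Src, std::size_t EVL) {
  const double Start = -std::numeric_limits<double>::quiet_NaN();
  return foldActiveLanes(Start, Src, EVL,
                         [](double A, double B) { return std::fmax(A, B); });
}
```

When no lane is active, each model simply returns its start value; for the fmax model that is NaN.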
diff --git a/llvm/lib/Transforms/Utils/LoopUtils.cpp b/llvm/lib/Transforms/Utils/LoopUtils.cpp
index 73c5d636782294..d0abcdfb1440ab 100644
--- a/llvm/lib/Transforms/Utils/LoopUtils.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUtils.cpp
@@ -1204,6 +1204,48 @@ Value *llvm::createSimpleTargetReduction(IRBuilderBase &Builder, Value *Src,
}
}
+Value *llvm::createSimpleTargetReduction(IRBuilderBase &Builder, Value *Src,
+ RecurKind RdxKind, Value *EVL,
+ Value *Mask) {
+ auto *SrcVecEltTy = cast<VectorType>(Src->getType())->getElementType();
+ switch (RdxKind) {
+ case RecurKind::Add:
+ return Builder.CreateAddReduce(Src, EVL, Mask);
+ case RecurKind::Mul:
+ return Builder.CreateMulReduce(Src, EVL, Mask);
+ case RecurKind::And:
+ return Builder.CreateAndReduce(Src, EVL, Mask);
+ case RecurKind::Or:
+ return Builder.CreateOrReduce(Src, EVL, Mask);
+ case RecurKind::Xor:
+ return Builder.CreateXorReduce(Src, EVL, Mask);
+ case RecurKind::FMulAdd:
+ case RecurKind::FAdd:
+ return Builder.CreateFAddReduce(ConstantFP::getNegativeZero(SrcVecEltTy),
+ Src, EVL, Mask);
+ case RecurKind::FMul:
+ return Builder.CreateFMulReduce(ConstantFP::get(SrcVecEltTy, 1.0), Src, EVL,
+ Mask);
+ case RecurKind::SMax:
+ return Builder.CreateIntMaxReduce(Src, EVL, true, Mask);
+ case RecurKind::SMin:
+ return Builder.CreateIntMinReduce(Src, EVL, true, Mask);
+ case RecurKind::UMax:
+ return Builder.CreateIntMaxReduce(Src, EVL, false, Mask);
+ case RecurKind::UMin:
+ return Builder.CreateIntMinReduce(Src, EVL, false, Mask);
+ case RecurKind::FMax:
+ return Builder.CreateFPMaxReduce(Src, EVL, Mask);
+ case RecurKind::FMin:
+ return Builder.CreateFPMinReduce(Src, EVL, Mask);
+ case RecurKind::FMinimum:
+ case RecurKind::FMaximum:
+ assert(0 && "FMaximum/FMinimum reduction VP intrinsic is not supported.");
+ default:
+ llvm_unreachable("Unhandled opcode");
+ }
+}
+
Value *llvm::createTargetReduction(IRBuilderBase &B,
const RecurrenceDescriptor &Desc, Value *Src,
PHINode *OrigPhi) {
@@ -1232,6 +1274,20 @@ Value *llvm::createOrderedReduction(IRBuilderBase &B,
return B.CreateFAddReduce(Start, Src);
}
+Value *llvm::createOrderedReduction(IRBuilderBase &B,
+ const RecurrenceDescriptor &Desc,
+ Value *Src, Value *Start, Value *EVL,
+ Value *Mask) {
+ assert((Desc.getRecurrenceKind() == RecurKind::FAdd ||
+ Desc.getRecurrenceKind() == RecurKind::FMulAdd) &&
+ "Unexpected reduction kind");
+ assert(Src->getType()->isVectorTy() && "Expected a vector type");
+ assert(!Start->getType()->isVectorTy() && "Expected a scalar type");
+ assert(EVL->getType()->isIntegerTy() && "Expected an integer type");
+
+ return B.CreateFAddReduce(Start, Src, EVL, Mask);
+}
+
void llvm::propagateIRFlags(Value *I, ArrayRef<Value *> VL, Value *OpValue,
bool IncludeWrapFlags) {
auto *VecOp = dyn_cast<Instruction>(I);
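Reviewer's note: the ordered overload above threads the scalar Start through the reduction rather than using a neutral value, because an ordered floating-point reduction has to add lanes strictly in sequence. A small scalar sketch of why lane order matters (hypothetical names; this models the sequential behaviour expected when no reassociation flag is set and is not the LLVM implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of an ordered, masked, EVL-limited fadd reduction:
// active lanes are added to Start one at a time, in increasing lane order.
double orderedFAddModel(double Start, const std::vector<double> &Src,
                        const std::vector<bool> &Mask, std::size_t EVL) {
  assert(Src.size() == Mask.size() && "mask and source must have equal length");
  double Acc = Start;
  for (std::size_t I = 0; I < Src.size() && I < EVL; ++I)
    if (Mask[I])
      Acc += Src[I];
  return Acc;
}
```

Feeding the lanes {1e30, 1.0, -1e30} yields 0.0 because the 1.0 is absorbed by 1e30 before the cancellation, whereas {1e30, -1e30, 1.0} yields 1.0. A reduction that is free to reassociate could produce either answer, which is the reason ordered reductions are kept sequential.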
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 33c4decd58a6c2..1db531e170a4bf 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1526,6 +1526,17 @@ class LoopVectorizationCostModel {
ForceTailFoldingStyle.getValue());
if (ForceTailFoldingStyle != TailFoldingStyle::DataWithEVL)
return;
+
+ // Block folding with EVL since vector-predication intrinsics do not yet
+ // support FMinimum and FMaximum reductions.
+ // FIXME: remove this check once llvm.vp.reduce.fminimum/fmaximum are
+ // supported.
+ bool ContainsFMinimumOrFMaximumReduction =
+ any_of(Legal->getReductionVars(), [&](auto &Reduction) {
+ const RecurrenceDescriptor &RdxDesc = Reduction.second;
+ RecurKind Kind = RdxDesc.getRecurrenceKind();
+ return Kind == RecurKind::FMinimum || Kind == RecurKind::FMaximum;
+ });
// Override forced styles if needed.
// FIXME: use actual opcode/data type for analysis here.
// FIXME: Investigate opportunity for fixed vector factor.
@@ -1535,8 +1546,7 @@ class LoopVectorizationCostModel {
!EnableVPlanNativePath &&
// FIXME: implement support for max safe dependency distance.
Legal->isSafeForAnyVectorWidth() &&
- // FIXME: remove this once reductions are supported.
- Legal->getReductionVars().empty();
+ !ContainsFMinimumOrFMaximumReduction;
if (!EVLIsLegal) {
// If for some reason EVL mode is unsupported, fallback to
// DataWithoutLaneMask to try to vectorize the loop with folded tail
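Reviewer's note: the legality change in this hunk replaces "no reductions at all" with "no FMinimum/FMaximum reductions". A standalone sketch of that predicate, using a hypothetical enum and function names in place of LLVM's RecurKind and the cost-model members:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the recurrence kinds relevant to this check.
enum class RecurKindModel { Add, FAdd, FMax, FMinimum, FMaximum };

// True if any reduction in the loop is an FMinimum or FMaximum reduction.
bool containsFMinimumOrFMaximum(const std::vector<RecurKindModel> &Kinds) {
  return std::any_of(Kinds.begin(), Kinds.end(), [](RecurKindModel K) {
    return K == RecurKindModel::FMinimum || K == RecurKindModel::FMaximum;
  });
}

// EVL tail folding stays legal only if the other checks pass and no
// unsupported reduction kind is present.
bool evlIsLegalModel(bool OtherChecksPass,
                     const std::vector<RecurKindModel> &Kinds) {
  return OtherChecksPass && !containsFMinimumOrFMaximum(Kinds);
}
```

In this model a loop with no reductions, or with only Add/FAdd/FMax reductions, remains eligible, while a single FMaximum reduction forces the fallback path.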
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index c74329a0bcc4ac..a444064dab692a 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -843,6 +843,7 @@ class VPSingleDefRecipe : public VPRecipeBase, public VPValue {
case VPRecipeBase::VPDerivedIVSC:
case VPRecipeBase::VPExpandSCEVSC:
case VPRecipeBase::VPInstructionSC:
+ case VPRecipeBase::VPReductionEVLSC:
case VPRecipeBase::VPReductionSC:
case VPRecipeBase::VPReplicateSC:
case VPRecipeBase::VPScalarIVStepsSC:
@@ -2110,6 +2111,12 @@ class VPReductionRecipe : public VPSingleDefRecipe {
VPSlotTracker &SlotTracker) const override;
#endif
+ /// Return the recurrence descriptor for the in-loop reduction.
+ const RecurrenceDescriptor &getRecurrenceDescriptor() const {
+ return RdxDesc;
+ }
+ /// Return true if the in-loop reduction is ordered.
+ bool isOrdered() const { return IsOrdered; }
/// The VPValue of the scalar Chain being accumulated.
VPValue *getChainOp() const { return getOperand(0); }
/// The VPValue of the vector value to be reduced.
@@ -2120,6 +2127,63 @@ class VPReductionRecipe : public VPSingleDefRecipe {
}
};
+/// A recipe to represent in-loop reduction operations with vector-predication
+/// intrinsics, performing a reduction on a vector operand with the explicit
+/// vector length (EVL) into a scalar value, and adding the result to a chain.
+/// The Operands are {ChainOp, VecOp, EVL, [Condition]}.
+class VPReductionEVLRecipe : public VPSingleDefRecipe {
+ /// The recurrence descriptor for the reduction in question.
+ const RecurrenceDescriptor &RdxDesc;
+ bool IsOrdered;
+
+public:
+ VPReductionEVLRecipe(VPReductionRecipe *R, VPValue *EVL)
+ : VPSingleDefRecipe(
+ VPDef::VPReductionEVLSC,
+ ArrayRef<VPValue *>({R->getChainOp(), R->getVecOp(), EVL}),
+ R->getUnderlyingInstr()),
+ RdxDesc(R->getRecurrenceDescriptor()), IsOrdered(R->isOrdered()) {
+ VPValue *CondOp = R->getCondOp();
+ if (CondOp)
+ addOperand(CondOp);
+ }
+
+ ~VPReductionEVLRecipe() override = default;
+
+ VPReductionEVLRecipe *clone() override {
+ llvm_unreachable("cloning not implemented yet");
+ }
+
+ VP_CLASSOF_IMPL(VPDef::VPReductionEVLSC)
+
+ /// Generate the reduction in the loop
+ void execute(VPTransformState &State) override;
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+ /// Print the recipe.
+ void print(raw_ostream &O, const Twine &Indent,
+ VPSlotTracker &SlotTracker) const override;
+#endif
+
+ /// The VPValue of the scalar Chain being accumulated.
+ VPValue *getChainOp() const { return g...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
df1c995 to af3e8a5
The title mentions this adds support for in-loop reductions, but I wasn't able to find a check to make sure we only vectorize in-loop reductions?
All the tests seem to pass flags that steer towards in-loop/ordered reductions, so the case where the regular reduction strategy is chosen may not be well tested.
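To make the two strategies in this comment concrete, here is a scalar sketch contrasting an in-loop reduction with the regular (out-of-loop) strategy under EVL-style tail folding. It is illustrative only: the function names are hypothetical, and it assumes each iteration processes EVL = min(VF, remaining) elements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// In-loop strategy: every vector iteration reduces its active lanes straight
// into a scalar accumulator.
int64_t sumInLoopModel(const std::vector<int64_t> &Data, std::size_t VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  int64_t Acc = 0;
  for (std::size_t Base = 0; Base < Data.size(); Base += VF) {
    std::size_t EVL = std::min(VF, Data.size() - Base);
    for (std::size_t Lane = 0; Lane < EVL; ++Lane)
      Acc += Data[Base + Lane];
  }
  return Acc;
}

// Regular strategy: keep VF partial sums in a vector accumulator and reduce
// once after the loop. Lanes at or beyond EVL must keep their previous
// partial sum, otherwise the final iteration would corrupt the result.
int64_t sumOutOfLoopModel(const std::vector<int64_t> &Data, std::size_t VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  std::vector<int64_t> VecAcc(VF, 0);
  for (std::size_t Base = 0; Base < Data.size(); Base += VF) {
    std::size_t EVL = std::min(VF, Data.size() - Base);
    for (std::size_t Lane = 0; Lane < EVL; ++Lane)
      VecAcc[Lane] += Data[Base + Lane];
  }
  int64_t Acc = 0;
  for (int64_t Partial : VecAcc)
    Acc += Partial;
  return Acc;
}
```

Both models agree for integer addition; the point of the sketch is that the out-of-loop form needs explicit handling of the inactive tail lanes, which is the case the comment suggests is not exercised by the tests.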
llvm/test/Transforms/LoopVectorize/RISCV/vectorize-force-tail-with-evl-reduction.ll
llvm/include/llvm/IR/VectorBuilder.h (Outdated)
@@ -57,6 +58,11 @@ class VectorBuilder {
return RetType();
}

// Helper function for creating VP intrinsic call.
Independent of this change, but if VectorBuilder only supports generation of vector-predication intrinsics, then it would be better to call it VectorPredicationBuilder to avoid confusion.
Indeed, this can be adjusted later. (But I would prefer a shorter name, perhaps just VectorPredBuilder would be good enough?)
or VPBuilder, like there's IRBuilder
"or VPBuilder, like there's IRBuilder"
That's unfortunate, as there is already a class named VPBuilder in LoopVectorizationPlanner.h.
/// VPlan-based builder utility analogous to IRBuilder.
class VPBuilder {
VPBasicBlock *BB = nullptr;
I think it's worth being more explicit with the name of the utility in llvm/IR; VectorPredBuilder sounds good to me.
nit: use /// for doc-comment
Indeed, we might need to change the title.
@fhahn ping
Sounds good to me, but it would be good to have some upstream buildbot that builds some code with the various options to have some runtime testing.
…ion." This reverts commit 8488520.
******************** Failed Tests (2):
LLVM-Unit :: Transforms/Vectorize/./VectorizeTests/21/51
LLVM-Unit :: Transforms/Vectorize/./VectorizeTests/26/51
llvm/IR should not depend on llvm/Analysis.
@@ -15,6 +15,7 @@
#ifndef LLVM_IR_VECTORBUILDER_H
#define LLVM_IR_VECTORBUILDER_H

#include <llvm/Analysis/IVDescriptors.h>
This is a layering violation.
Thanks for pointing out this issue. I opened #99276 to fix it. Please take a look, thanks a lot.
…m#90184) Summary: Following from llvm#87816, add VPReductionEVLRecipe to describe vector predication reduction. Address one of TODOs from llvm#76172. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D59822470
) Summary: Following from #87816, add VPReductionEVLRecipe to describe vector predication reduction. Address one of TODOs from #76172. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251485