[NFC] Use initial-stack-allocations for more data structures #110544
Conversation
This replaces some of the most frequent offenders: uses of a DenseMap that cause a malloc where the typical element count is small enough to fit in an initial stack allocation. Most of these are fairly obvious; one to highlight is the collectOffset method of GEP instructions: if there's a GEP, of course it's going to have at least one offset, but every time we've called collectOffset we've ended up calling malloc as well, for the DenseMap inside the MapVector.
I've done a little more testing on these because, being nested, there's more initialization to do when they're created. The net effect is positive on the compile-time tracker.
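For readers outside the LLVM codebase, here is a minimal sketch of the pattern being applied. DenseMap, SmallDenseMap, and SmallMapVector are the real llvm/ADT containers; the function and variable names are illustrative only. SmallDenseMap<K, V, N> embeds N buckets in the object itself, so a map that stays small never touches the heap, and it spills to a heap allocation only once load-factor-driven growth outgrows the inline storage.

#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/MapVector.h"
#include "llvm/IR/Value.h"

using namespace llvm;

void sketch(Value *A, Value *B) {
  // A default-constructed DenseMap has no buckets at all: the first
  // insertion heap-allocates the bucket array.
  DenseMap<Value *, unsigned> HeapMap;
  HeapMap[A] = 0; // malloc happens here

  // SmallDenseMap keeps 4 buckets inline, so small maps stay on the
  // stack; only a map that grows past its inline capacity allocates.
  SmallDenseMap<Value *, unsigned, 4> InlineMap;
  InlineMap[A] = 0; // no malloc
  InlineMap[B] = 1; // still no malloc

  // SmallMapVector applies the same idea to MapVector: both the map and
  // the insertion-order vector get inline storage for N elements.
  SmallMapVector<Value *, unsigned, 4> OrderedInline;
  OrderedInline.insert({A, 0}); // no malloc
}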
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-debuginfo Author: Jeremy Morse (jmorse)
Changes: I've scanned for more scenarios where calls to malloc can be avoided through the use of SmallDenseMaps instead of DenseMaps, i.e. initial stack allocations. There are two commits in this PR: one is ordinary replacements, the second involves nested containers. I figured it's worth testing the changes to nested containers, as that was identified as a risk in an earlier PR.
Both are positive for CTMark. Most of the changes are fairly obvious; one to highlight is the collectOffset method of GEP instructions: if there's a GEP, of course it's going to have at least one offset, but every time we've called collectOffset we've ended up calling malloc as well, for the DenseMap in the MapVector.
Full diff: https://github.com/llvm/llvm-project/pull/110544.diff
17 Files Affected:
diff --git a/llvm/include/llvm/IR/Instructions.h b/llvm/include/llvm/IR/Instructions.h
index 75a059760f48fa..695a7a6aa9f254 100644
--- a/llvm/include/llvm/IR/Instructions.h
+++ b/llvm/include/llvm/IR/Instructions.h
@@ -1117,7 +1117,7 @@ class GetElementPtrInst : public Instruction {
/// the base GEP pointer.
bool accumulateConstantOffset(const DataLayout &DL, APInt &Offset) const;
bool collectOffset(const DataLayout &DL, unsigned BitWidth,
- MapVector<Value *, APInt> &VariableOffsets,
+ SmallMapVector<Value *, APInt, 4> &VariableOffsets,
APInt &ConstantOffset) const;
// Methods for support type inquiry through isa, cast, and dyn_cast:
static bool classof(const Instruction *I) {
diff --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h
index 88b9bfc0be4b15..0e9f6ed35dcb4e 100644
--- a/llvm/include/llvm/IR/Operator.h
+++ b/llvm/include/llvm/IR/Operator.h
@@ -528,7 +528,7 @@ class GEPOperator
/// Collect the offset of this GEP as a map of Values to their associated
/// APInt multipliers, as well as a total Constant Offset.
bool collectOffset(const DataLayout &DL, unsigned BitWidth,
- MapVector<Value *, APInt> &VariableOffsets,
+ SmallMapVector<Value *, APInt, 4> &VariableOffsets,
APInt &ConstantOffset) const;
};
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 6f211abb299e7a..e4792f7e6ce72e 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2831,7 +2831,7 @@ static void emitRangeList(
// Gather all the ranges that apply to the same section so they can share
// a base address entry.
- MapVector<const MCSection *, std::vector<decltype(&*R.begin())>> SectionRanges;
+ SmallMapVector<const MCSection *, std::vector<decltype(&*R.begin())>, 16> SectionRanges;
for (const auto &Range : R)
SectionRanges[&Range.Begin->getSection()].push_back(&Range);
diff --git a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
index f157ffc6bcc2d7..68db65ace9a427 100644
--- a/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
+++ b/llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
@@ -1046,7 +1046,7 @@ class VLocTracker {
/// transfer function for this block, as part of the dataflow analysis. The
/// movement of values between locations inside of a block is handled at a
/// much later stage, in the TransferTracker class.
- MapVector<DebugVariableID, DbgValue> Vars;
+ SmallMapVector<DebugVariableID, DbgValue, 8> Vars;
SmallDenseMap<DebugVariableID, const DILocation *, 8> Scopes;
MachineBasicBlock *MBB = nullptr;
const OverlapMap &OverlappingFragments;
@@ -1128,7 +1128,7 @@ class InstrRefBasedLDV : public LDVImpl {
/// Live in/out structure for the variable values: a per-block map of
/// variables to their values.
- using LiveIdxT = DenseMap<const MachineBasicBlock *, DbgValue *>;
+ using LiveIdxT = SmallDenseMap<const MachineBasicBlock *, DbgValue *, 16>;
using VarAndLoc = std::pair<DebugVariableID, DbgValue>;
diff --git a/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp b/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
index 68dece6cf73e91..ca8fac6b437777 100644
--- a/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
+++ b/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
@@ -621,7 +621,7 @@ void ScheduleDAGInstrs::initSUnits() {
}
}
-class ScheduleDAGInstrs::Value2SUsMap : public MapVector<ValueType, SUList> {
+class ScheduleDAGInstrs::Value2SUsMap : public SmallMapVector<ValueType, SUList, 4> {
/// Current total number of SUs in map.
unsigned NumNodes = 0;
@@ -656,7 +656,7 @@ class ScheduleDAGInstrs::Value2SUsMap : public MapVector<ValueType, SUList> {
/// Clears map from all contents.
void clear() {
- MapVector<ValueType, SUList>::clear();
+ SmallMapVector<ValueType, SUList, 4>::clear();
NumNodes = 0;
}
diff --git a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
index e4ee3fd99f16e3..57dbdda68d61b6 100644
--- a/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
@@ -183,7 +183,7 @@ class ScheduleDAGRRList : public ScheduleDAGSDNodes {
// Hack to keep track of the inverse of FindCallSeqStart without more crazy
// DAG crawling.
- DenseMap<SUnit*, SUnit*> CallSeqEndForStart;
+ SmallDenseMap<SUnit*, SUnit*, 16> CallSeqEndForStart;
public:
ScheduleDAGRRList(MachineFunction &mf, bool needlatency,
diff --git a/llvm/lib/IR/Instructions.cpp b/llvm/lib/IR/Instructions.cpp
index e95b98a6404432..009e0c03957c97 100644
--- a/llvm/lib/IR/Instructions.cpp
+++ b/llvm/lib/IR/Instructions.cpp
@@ -1584,7 +1584,7 @@ bool GetElementPtrInst::accumulateConstantOffset(const DataLayout &DL,
bool GetElementPtrInst::collectOffset(
const DataLayout &DL, unsigned BitWidth,
- MapVector<Value *, APInt> &VariableOffsets,
+ SmallMapVector<Value *, APInt, 4> &VariableOffsets,
APInt &ConstantOffset) const {
// Delegate to the generic GEPOperator implementation.
return cast<GEPOperator>(this)->collectOffset(DL, BitWidth, VariableOffsets,
diff --git a/llvm/lib/IR/Operator.cpp b/llvm/lib/IR/Operator.cpp
index 6c9862556f5504..f93ff8f6fc8a25 100644
--- a/llvm/lib/IR/Operator.cpp
+++ b/llvm/lib/IR/Operator.cpp
@@ -201,7 +201,7 @@ bool GEPOperator::accumulateConstantOffset(
bool GEPOperator::collectOffset(
const DataLayout &DL, unsigned BitWidth,
- MapVector<Value *, APInt> &VariableOffsets,
+ SmallMapVector<Value *, APInt, 4> &VariableOffsets,
APInt &ConstantOffset) const {
assert(BitWidth == DL.getIndexSizeInBits(getPointerAddressSpace()) &&
"The offset bit width does not match DL specification.");
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index 7bd618b2d9660c..24bfbff41ec5c0 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -402,7 +402,7 @@ static Value *GEPToVectorIndex(GetElementPtrInst *GEP, AllocaInst *Alloca,
// TODO: Extracting a "multiple of X" from a GEP might be a useful generic
// helper.
unsigned BW = DL.getIndexTypeSizeInBits(GEP->getType());
- MapVector<Value *, APInt> VarOffsets;
+ SmallMapVector<Value *, APInt, 4> VarOffsets;
APInt ConstOffset(BW, 0);
if (GEP->getPointerOperand()->stripPointerCasts() != Alloca ||
!GEP->collectOffset(DL, BW, VarOffsets, ConstOffset))
diff --git a/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp b/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
index 01642b0677aba3..9943c3cbb9fc7d 100644
--- a/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
+++ b/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
@@ -843,7 +843,7 @@ getStrideAndModOffsetOfGEP(Value *PtrOp, const DataLayout &DL) {
// Return a minimum gep stride, greatest common divisor of consective gep
// index scales(c.f. Bézout's identity).
while (auto *GEP = dyn_cast<GEPOperator>(PtrOp)) {
- MapVector<Value *, APInt> VarOffsets;
+ SmallMapVector<Value *, APInt, 4> VarOffsets;
if (!GEP->collectOffset(DL, BW, VarOffsets, ModOffset))
break;
diff --git a/llvm/lib/Transforms/IPO/AttributorAttributes.cpp b/llvm/lib/Transforms/IPO/AttributorAttributes.cpp
index 416dd09ca874bf..238bdf9c344b08 100644
--- a/llvm/lib/Transforms/IPO/AttributorAttributes.cpp
+++ b/llvm/lib/Transforms/IPO/AttributorAttributes.cpp
@@ -1557,7 +1557,7 @@ bool AAPointerInfoFloating::collectConstantsForGEP(Attributor &A,
const OffsetInfo &PtrOI,
const GEPOperator *GEP) {
unsigned BitWidth = DL.getIndexTypeSizeInBits(GEP->getType());
- MapVector<Value *, APInt> VariableOffsets;
+ SmallMapVector<Value *, APInt, 4> VariableOffsets;
APInt ConstantOffset(BitWidth, 0);
assert(!UsrOI.isUnknown() && !PtrOI.isUnknown() &&
diff --git a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
index 7e2721d0c5a5e6..7c06e0c757e1cc 100644
--- a/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp
@@ -385,7 +385,7 @@ struct Decomposition {
struct OffsetResult {
Value *BasePtr;
APInt ConstantOffset;
- MapVector<Value *, APInt> VariableOffsets;
+ SmallMapVector<Value *, APInt, 4> VariableOffsets;
bool AllInbounds;
OffsetResult() : BasePtr(nullptr), ConstantOffset(0, uint64_t(0)) {}
@@ -410,7 +410,7 @@ static OffsetResult collectOffsets(GEPOperator &GEP, const DataLayout &DL) {
// If we have a nested GEP, check if we can combine the constant offset of the
// inner GEP with the outer GEP.
if (auto *InnerGEP = dyn_cast<GetElementPtrInst>(Result.BasePtr)) {
- MapVector<Value *, APInt> VariableOffsets2;
+ SmallMapVector<Value *, APInt, 4> VariableOffsets2;
APInt ConstantOffset2(BitWidth, 0);
bool CanCollectInner = InnerGEP->collectOffset(
DL, BitWidth, VariableOffsets2, ConstantOffset2);
diff --git a/llvm/lib/Transforms/Scalar/GVN.cpp b/llvm/lib/Transforms/Scalar/GVN.cpp
index db39d8621d0771..2ba600497e00d3 100644
--- a/llvm/lib/Transforms/Scalar/GVN.cpp
+++ b/llvm/lib/Transforms/Scalar/GVN.cpp
@@ -422,7 +422,7 @@ GVNPass::Expression GVNPass::ValueTable::createGEPExpr(GetElementPtrInst *GEP) {
Type *PtrTy = GEP->getType()->getScalarType();
const DataLayout &DL = GEP->getDataLayout();
unsigned BitWidth = DL.getIndexTypeSizeInBits(PtrTy);
- MapVector<Value *, APInt> VariableOffsets;
+ SmallMapVector<Value *, APInt, 4> VariableOffsets;
APInt ConstantOffset(BitWidth, 0);
if (GEP->collectOffset(DL, BitWidth, VariableOffsets, ConstantOffset)) {
// Convert into offset representation, to recognize equivalent address
diff --git a/llvm/lib/Transforms/Scalar/JumpTableToSwitch.cpp b/llvm/lib/Transforms/Scalar/JumpTableToSwitch.cpp
index 2a4f68e1252523..7f99cd2060a9d8 100644
--- a/llvm/lib/Transforms/Scalar/JumpTableToSwitch.cpp
+++ b/llvm/lib/Transforms/Scalar/JumpTableToSwitch.cpp
@@ -56,7 +56,7 @@ static std::optional<JumpTableTy> parseJumpTable(GetElementPtrInst *GEP,
const DataLayout &DL = F.getDataLayout();
const unsigned BitWidth =
DL.getIndexSizeInBits(GEP->getPointerAddressSpace());
- MapVector<Value *, APInt> VariableOffsets;
+ SmallMapVector<Value *, APInt, 4> VariableOffsets;
APInt ConstantOffset(BitWidth, 0);
if (!GEP->collectOffset(DL, BitWidth, VariableOffsets, ConstantOffset))
return std::nullopt;
diff --git a/llvm/lib/Transforms/Utils/Local.cpp b/llvm/lib/Transforms/Utils/Local.cpp
index 7659fc69196151..cfe40f91f9a5df 100644
--- a/llvm/lib/Transforms/Utils/Local.cpp
+++ b/llvm/lib/Transforms/Utils/Local.cpp
@@ -925,7 +925,7 @@ CanPropagatePredecessorsForPHIs(BasicBlock *BB, BasicBlock *Succ,
}
using PredBlockVector = SmallVector<BasicBlock *, 16>;
-using IncomingValueMap = DenseMap<BasicBlock *, Value *>;
+using IncomingValueMap = SmallDenseMap<BasicBlock *, Value *, 16>;
/// Determines the value to use as the phi node input for a block.
///
@@ -2467,7 +2467,7 @@ Value *getSalvageOpsForGEP(GetElementPtrInst *GEP, const DataLayout &DL,
SmallVectorImpl<Value *> &AdditionalValues) {
unsigned BitWidth = DL.getIndexSizeInBits(GEP->getPointerAddressSpace());
// Rewrite a GEP into a DIExpression.
- MapVector<Value *, APInt> VariableOffsets;
+ SmallMapVector<Value *, APInt, 4> VariableOffsets;
APInt ConstantOffset(BitWidth, 0);
if (!GEP->collectOffset(DL, BitWidth, VariableOffsets, ConstantOffset))
return nullptr;
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 08e78cb49c69fc..6773f41dd0057d 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -5122,7 +5122,7 @@ LoopVectorizationCostModel::calculateRegisterUsage(ArrayRef<ElementCount> VFs) {
// Each 'key' in the map opens a new interval. The values
// of the map are the index of the 'last seen' usage of the
// instruction that is the key.
- using IntervalMap = DenseMap<Instruction *, unsigned>;
+ using IntervalMap = SmallDenseMap<Instruction *, unsigned, 16>;
// Maps instruction to its index.
SmallVector<Instruction *, 64> IdxToInstr;
@@ -5165,7 +5165,7 @@ LoopVectorizationCostModel::calculateRegisterUsage(ArrayRef<ElementCount> VFs) {
// Saves the list of intervals that end with the index in 'key'.
using InstrList = SmallVector<Instruction *, 2>;
- DenseMap<unsigned, InstrList> TransposeEnds;
+ SmallDenseMap<unsigned, InstrList, 16> TransposeEnds;
// Transpose the EndPoints to a list of values that end at each index.
for (auto &Interval : EndPoint)
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index e45fcb2b5c790c..6bcb4c7af73ab1 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -5470,7 +5470,7 @@ BoUpSLP::getReorderingData(const TreeEntry &TE, bool TopToBottom) {
}
return I1 < I2;
};
- DenseMap<unsigned, unsigned> PhiToId;
+ SmallDenseMap<unsigned, unsigned, 16> PhiToId;
SmallVector<unsigned> Phis(TE.Scalars.size());
std::iota(Phis.begin(), Phis.end(), 0);
OrdersType ResOrder(TE.Scalars.size());
@@ -10311,7 +10311,7 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef<Value *> VectorizedVals,
E->isAltShuffle() ? (unsigned)Instruction::ShuffleVector : E->getOpcode();
if (E->CombinedOp != TreeEntry::NotCombinedOp)
ShuffleOrOp = E->CombinedOp;
- SetVector<Value *> UniqueValues(VL.begin(), VL.end());
+ SmallSetVector<Value *, 16> UniqueValues(VL.begin(), VL.end());
const unsigned Sz = UniqueValues.size();
SmallBitVector UsedScalars(Sz, false);
for (unsigned I = 0; I < Sz; ++I) {
@@ -18005,7 +18005,7 @@ class HorizontalReduction {
/// List of possibly reduced values.
SmallVector<SmallVector<Value *>> ReducedVals;
/// Maps reduced value to the corresponding reduction operation.
- DenseMap<Value *, SmallVector<Instruction *>> ReducedValsToOps;
+ SmallDenseMap<Value *, SmallVector<Instruction *>, 16> ReducedValsToOps;
WeakTrackingVH ReductionRoot;
/// The type of reduction operation.
RecurKind RdxKind;
@@ -18374,7 +18374,9 @@ class HorizontalReduction {
// instruction op id and/or alternate op id, plus do extra analysis for
// loads (grouping them by the distabce between pointers) and cmp
// instructions (grouping them by the predicate).
- MapVector<size_t, MapVector<size_t, MapVector<Value *, unsigned>>>
+ SmallMapVector<
+ size_t, SmallMapVector<size_t, SmallMapVector<Value *, unsigned, 2>, 2>,
+ 8>
PossibleReducedVals;
initReductionOps(Root);
DenseMap<Value *, SmallVector<LoadInst *>> LoadsMap;
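To make the caller-side effect concrete, this is roughly what the updated call sites above (e.g. the GVN.cpp and Local.cpp hunks) look like after the patch; the fragment paraphrases those hunks rather than quoting any one of them, and the function name is illustrative:

#include "llvm/ADT/APInt.h"
#include "llvm/ADT/MapVector.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

static void decomposeGEP(GetElementPtrInst *GEP) {
  const DataLayout &DL = GEP->getDataLayout();
  unsigned BitWidth = DL.getIndexTypeSizeInBits(GEP->getType());
  // Four inline slots: a GEP with few variable indices populates this
  // map without any heap allocation for the map itself.
  SmallMapVector<Value *, APInt, 4> VariableOffsets;
  APInt ConstantOffset(BitWidth, 0);
  if (GEP->collectOffset(DL, BitWidth, VariableOffsets, ConstantOffset)) {
    // Each entry maps an index Value* to its APInt multiplier; the
    // constant part has been accumulated into ConstantOffset.
  }
}

Because VariableOffsets is passed by reference into collectOffset, every caller has to adopt the SmallMapVector signature at once, which is why the change fans out across so many files.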
✅ With the latest revision this PR passed the C/C++ code formatter.
LGTM
…0544) This replaces some of the most frequent offenders: uses of a DenseMap that cause a malloc where the typical element count is small enough to fit in an initial stack allocation. Most of these are fairly obvious; one to highlight is the collectOffset method of GEP instructions: if there's a GEP, of course it's going to have at least one offset, but every time we've called collectOffset we've ended up calling malloc as well, for the DenseMap in the MapVector.