Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FunctionAttrs] Add the "initializes" attribute inference #97373

Open
wants to merge 12 commits into
base: main
Choose a base branch
from
349 changes: 346 additions & 3 deletions llvm/lib/Transforms/IPO/FunctionAttrs.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
#include "llvm/Transforms/IPO/FunctionAttrs.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/ADT/SCCIterator.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SetVector.h"
Expand All @@ -36,6 +37,7 @@
#include "llvm/IR/Attributes.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constant.h"
#include "llvm/IR/ConstantRangeList.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/InstIterator.h"
Expand Down Expand Up @@ -580,6 +582,208 @@ struct ArgumentUsesTracker : public CaptureTracker {
const SCCNodeSet &SCCNodes;
};

// A struct of argument use: a Use and the offset it accesses. This struct
// is to track uses inside function via GEP. If GEP has a non-constant index,
// the Offset field is nullopt.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use doxygen comments (///) for classes.

struct ArgumentUse {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reminding! Done.

Use *U;
std::optional<int64_t> Offset;
};

// A struct of argument access info. "Unknown" accesses are the cases like
// unrecognized instructions, instructions that have more than one use of
// the argument, or volatile memory accesses. "Unknown" implies "IsClobber"
// and an empty access range.
// Write or Read accesses can be clobbers as well for example, a Load with
// scalable type.
struct ArgumentAccessInfo {
enum class AccessType : uint8_t { Write, Read, Unknown };
AccessType ArgAccessType;
bool IsClobber = false;
jvoung marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "clobber" terminology here is pretty weird, because usually "clobber" means "write", while here it's more like "read".

Also, after looking at the code, I'm a bit unclear on whether we really need both AccessType::Unknown and IsClobber. Isn't one of them sufficient?

ConstantRangeList AccessRanges;
};

struct UsesPerBlockInfo {
SmallDenseMap<Instruction *, ArgumentAccessInfo, 4> Insts;
bool HasWrites = false;
bool HasClobber = false;
};

ArgumentAccessInfo GetArgmentAccessInfo(const Instruction *I,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ArgumentAccessInfo GetArgmentAccessInfo(const Instruction *I,
ArgumentAccessInfo getArgmentAccessInfo(const Instruction *I,

const ArgumentUse &ArgUse,
const DataLayout &DL) {
auto GetTypeAccessRange =
[&DL](Type *Ty,
std::optional<int64_t> Offset) -> std::optional<ConstantRange> {
auto TypeSize = DL.getTypeStoreSize(Ty);
if (!TypeSize.isScalable() && Offset) {
int64_t Size = TypeSize.getFixedValue();
return ConstantRange(APInt(64, *Offset, true),
APInt(64, *Offset + Size, true));
}
return std::nullopt;
};
auto GetConstantIntRange =
[](Value *Length,
std::optional<int64_t> Offset) -> std::optional<ConstantRange> {
auto *ConstantLength = dyn_cast<ConstantInt>(Length);
if (ConstantLength && Offset)
return ConstantRange(
APInt(64, *Offset, true),
APInt(64, *Offset + ConstantLength->getSExtValue(), true));
return std::nullopt;
};
if (auto *SI = dyn_cast<StoreInst>(I)) {
aeubanks marked this conversation as resolved.
Show resolved Hide resolved
if (!SI->isVolatile() && &SI->getOperandUse(1) == ArgUse.U) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These isVolatile() checks should probably be isSimple() checks? Presumably you don't want to handle atomics.

// Get the fixed type size of "SI". Since the access range of a write
// will be unioned, if "SI" doesn't have a fixed type size, we just set
// the access range to empty.
ConstantRangeList AccessRanges;
if (auto TypeAccessRange =
GetTypeAccessRange(SI->getAccessType(), ArgUse.Offset))
AccessRanges.insert(*TypeAccessRange);
return {ArgumentAccessInfo::AccessType::Write,
/*IsClobber=*/false, AccessRanges};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*IsClobber=*/false, AccessRanges};
/*IsClobber=*/false, std::move(AccessRanges)};

}
} else if (auto *LI = dyn_cast<LoadInst>(I)) {
if (!LI->isVolatile() && &LI->getOperandUse(0) == ArgUse.U) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a load only has one operand, so the &LI->getOperandUse(0) == ArgUse.U shouldn't be necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes! Moved it to an assert.

// Get the fixed type size of "LI". Different from Write, if "LI"
// doesn't have a fixed type size, we conservatively set as a clobber
// with an empty access range.
if (auto TypeAccessRange =
GetTypeAccessRange(LI->getAccessType(), ArgUse.Offset))
return {ArgumentAccessInfo::AccessType::Read,
/*IsClobber=*/false,
{*TypeAccessRange}};
return {ArgumentAccessInfo::AccessType::Read, /*IsClobber=*/true, {}};
}
} else if (auto *MemSet = dyn_cast<MemSetInst>(I)) {
if (!MemSet->isVolatile()) {
ConstantRangeList AccessRanges;
if (auto AccessRange =
GetConstantIntRange(MemSet->getLength(), ArgUse.Offset))
AccessRanges.insert(*AccessRange);
return {ArgumentAccessInfo::AccessType::Write,
/*IsClobber=*/false, AccessRanges};
}
} else if (auto *MTI = dyn_cast<MemTransferInst>(I)) {
if (!MTI->isVolatile()) {
if (&MTI->getOperandUse(0) == ArgUse.U) {
ConstantRangeList AccessRanges;
if (auto AccessRange =
GetConstantIntRange(MTI->getLength(), ArgUse.Offset))
AccessRanges.insert(*AccessRange);
return {ArgumentAccessInfo::AccessType::Write,
/*IsClobber=*/false, AccessRanges};
} else if (&MTI->getOperandUse(1) == ArgUse.U) {
if (auto AccessRange =
GetConstantIntRange(MTI->getLength(), ArgUse.Offset))
return {ArgumentAccessInfo::AccessType::Read,
/*IsClobber=*/false,
{*AccessRange}};
return {ArgumentAccessInfo::AccessType::Read, /*IsClobber=*/true, {}};
}
}
} else if (auto *CB = dyn_cast<CallBase>(I)) {
if (CB->isArgOperand(ArgUse.U)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should add a test where the argument is passed as an operand bundle to a call, rather than an argument

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

unsigned ArgNo = CB->getArgOperandNo(ArgUse.U);
bool IsInitialize = CB->paramHasAttr(ArgNo, Attribute::Initializes);
// Argument is only not clobbered when parameter is writeonly/readnone
// and nocapture.
bool IsClobber = !(CB->onlyWritesMemory(ArgNo) &&
CB->paramHasAttr(ArgNo, Attribute::NoCapture));
ConstantRangeList AccessRanges;
if (IsInitialize && ArgUse.Offset) {
Attribute Attr = CB->getParamAttr(ArgNo, Attribute::Initializes);
if (!Attr.isValid())
Attr = CB->getCalledFunction()->getParamAttribute(
ArgNo, Attribute::Initializes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after a620697 this is no longer necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Done.

ConstantRangeList CBCRL = Attr.getValueAsConstantRangeList();
for (ConstantRange &CR : CBCRL)
AccessRanges.insert(ConstantRange(CR.getLower() + *ArgUse.Offset,
CR.getUpper() + *ArgUse.Offset));
return {ArgumentAccessInfo::AccessType::Write, IsClobber, AccessRanges};
}
}
}
// Unrecognized instructions are considered clobbers.
return {ArgumentAccessInfo::AccessType::Unknown, /*IsClobber=*/true, {}};
}

// Collect the uses of argument "A" in "F" and store the uses info per block to
// "UsesPerBlock". Return a pair of bool that indicate whether there is any
// write access, and whether there is any write access outside of the entry
// block in "F", which will be used to simplify the inference for simple cases.
std::pair<bool, bool> CollectArgumentUsesPerBlock(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some doc for the function? What does the return value represent? UsesPerBlock updated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this return UsesPerBlock along with the two bools, rather than using an out parameter? and maybe change the return type to a struct with nice names rather than std::pair/tuple

Argument &A, Function &F,
SmallDenseMap<const BasicBlock *, UsesPerBlockInfo, 16> &UsesPerBlock) {
auto &DL = F.getParent()->getDataLayout();
auto PointerSize =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto PointerSize =
unsigned PointerSize =

DL.getIndexSizeInBits(A.getType()->getPointerAddressSpace());

bool HasAnyWrite = false;
bool HasWriteOutsideEntryBB = false;

BasicBlock &EntryBB = F.getEntryBlock();
SmallVector<ArgumentUse, 4> Worklist;
for (Use &U : A.uses())
Worklist.push_back({&U, 0});

// Update "UsesPerBlock" with the block of "I" as key and "Info" as value.
// Return true if the block of "I" has write accesses after updating.
auto UpdateUseInfo = [&UsesPerBlock](Instruction *I,
jvoung marked this conversation as resolved.
Show resolved Hide resolved
ArgumentAccessInfo Info) {
auto *BB = I->getParent();
auto &BBInfo = UsesPerBlock.getOrInsertDefault(BB);
jvoung marked this conversation as resolved.
Show resolved Hide resolved
bool AlreadyVisitedInst = BBInfo.Insts.contains(I);
auto &IInfo = BBInfo.Insts[I];

// Instructions that have more than one use of the argument are considered
// as clobbers.
if (AlreadyVisitedInst) {
IInfo = {ArgumentAccessInfo::AccessType::Unknown, /*IsClobber=*/true, {}};
BBInfo.HasClobber = true;
return false;
}

IInfo = Info;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
IInfo = Info;
IInfo = std::move(Info);

BBInfo.HasClobber |= IInfo.IsClobber;
bool InfoHasWrites =
IInfo.ArgAccessType == ArgumentAccessInfo::AccessType::Write &&
!IInfo.AccessRanges.empty();
BBInfo.HasWrites |= InfoHasWrites;
return InfoHasWrites;
};

// No need for a visited set because we don't look through phis, so there are
// no cycles.
while (!Worklist.empty()) {
ArgumentUse ArgUse = Worklist.pop_back_val();
User *U = ArgUse.U->getUser();
// Add GEP uses to worklist.
// If the GEP is not a constant GEP, set the ArgumentUse::Offset to nullopt.
if (auto *GEP = dyn_cast<GEPOperator>(U)) {
APInt Offset(PointerSize, 0, /*isSigned=*/true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
APInt Offset(PointerSize, 0, /*isSigned=*/true);
APInt Offset(PointerSize, 0);

The flag is not meaningful for a zero value.

bool IsConstGEP = GEP->accumulateConstantOffset(DL, Offset);
std::optional<int64_t> NewOffset = std::nullopt;
if (IsConstGEP && ArgUse.Offset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ArgUse.Offset is already nullopt, you can skip the accumulateConstantOffset() call.

NewOffset = *ArgUse.Offset + Offset.getSExtValue();
for (Use &U : GEP->uses())
Worklist.push_back({&U, NewOffset});
continue;
}

auto *I = cast<Instruction>(U);
bool HasWrite = UpdateUseInfo(I, GetArgmentAccessInfo(I, ArgUse, DL));

HasAnyWrite |= HasWrite;

if (HasWrite && I->getParent() != &EntryBB)
HasWriteOutsideEntryBB = true;
}
return {HasAnyWrite, HasWriteOutsideEntryBB};
}

} // end anonymous namespace

namespace llvm {
Expand Down Expand Up @@ -866,9 +1070,124 @@ static bool addAccessAttr(Argument *A, Attribute::AttrKind R) {
return true;
}

static bool inferInitializes(Argument &A, Function &F) {
SmallDenseMap<const BasicBlock *, UsesPerBlockInfo, 16> UsesPerBlock;
auto [HasAnyWrite, HasWriteOutsideEntryBB] =
CollectArgumentUsesPerBlock(A, F, UsesPerBlock);
// No write anywhere in the function, bail.
if (!HasAnyWrite)
return false;

BasicBlock &EntryBB = F.getEntryBlock();
DenseMap<const BasicBlock *, ConstantRangeList> Initialized;
auto VisitBlock = [&](const BasicBlock *BB) -> ConstantRangeList {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment on what VisitBlock does

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done and found a corner case :-)

"If this block has uses and none are writes, the argument is not initialized in this block."
Removed this early return. See the test.

auto UPB = UsesPerBlock.find(BB);

// If this block has uses and none are writes, the argument is not
// initialized in this block.
if (UPB != UsesPerBlock.end() && !UPB->second.HasWrites)
return ConstantRangeList();

ConstantRangeList CRL;

// Start with intersection of successors.
// If this block has any clobbering use, we're going to clear out the
// ranges at some point in this block anyway, so don't bother looking at
// successors.
if (UPB == UsesPerBlock.end() || !UPB->second.HasClobber) {
bool HasAddedSuccessor = false;
for (auto *Succ : successors(BB)) {
if (auto SuccI = Initialized.find(Succ); SuccI != Initialized.end()) {
if (HasAddedSuccessor) {
CRL = CRL.intersectWith(SuccI->second);
} else {
CRL = SuccI->second;
HasAddedSuccessor = true;
}
} else {
CRL = ConstantRangeList();
break;
}
}
}

if (UPB != UsesPerBlock.end()) {
// Sort uses in this block by instruction order.
SmallVector<std::pair<Instruction *, ArgumentAccessInfo>, 2> Insts;
append_range(Insts, UPB->second.Insts);
sort(Insts, [](std::pair<Instruction *, ArgumentAccessInfo> &LHS,
std::pair<Instruction *, ArgumentAccessInfo> &RHS) {
return LHS.first->comesBefore(RHS.first);
});

// From the end of the block to the beginning of the block, set
// initializes ranges.
for (auto &[_, Info] : reverse(Insts)) {
if (Info.IsClobber)
CRL = ConstantRangeList();
if (!Info.AccessRanges.empty()) {
if (Info.ArgAccessType == ArgumentAccessInfo::AccessType::Write) {
CRL = CRL.unionWith(Info.AccessRanges);
} else {
assert(Info.ArgAccessType == ArgumentAccessInfo::AccessType::Read);
for (const auto &ReadRange : Info.AccessRanges)
CRL.subtract(ReadRange);
}
}
}
}
return CRL;
};

ConstantRangeList EntryCRL;
// If all write instructions are in the EntryBB, or if the EntryBB has
// a clobbering use, we only need to look at EntryBB.
bool OnlyScanEntryBlock = !HasWriteOutsideEntryBB;
if (!OnlyScanEntryBlock)
if (auto EntryUPB = UsesPerBlock.find(&EntryBB);
EntryUPB != UsesPerBlock.end())
OnlyScanEntryBlock = EntryUPB->second.HasClobber;
if (OnlyScanEntryBlock) {
EntryCRL = VisitBlock(&EntryBB);
if (EntryCRL.empty())
return false;
} else {
// Visit successors before predecessors with a post-order walk of the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you expand the comment to include the reasoning behind this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

// blocks.
for (const BasicBlock *BB : post_order(&F)) {
ConstantRangeList CRL = VisitBlock(BB);
if (!CRL.empty())
Initialized[BB] = CRL;
}

auto EntryCRLI = Initialized.find(&EntryBB);
if (EntryCRLI == Initialized.end())
return false;

EntryCRL = EntryCRLI->second;
}

assert(!EntryCRL.empty() &&
"should have bailed already if EntryCRL is empty");

if (A.hasAttribute(Attribute::Initializes)) {
ConstantRangeList PreviousCRL =
A.getAttribute(Attribute::Initializes).getValueAsConstantRangeList();
if (PreviousCRL == EntryCRL)
return false;
EntryCRL = EntryCRL.unionWith(PreviousCRL);
}

A.addAttr(Attribute::get(A.getContext(), Attribute::Initializes,
EntryCRL.rangesRef()));

return true;
}

/// Deduce nocapture attributes for the SCC.
static void addArgumentAttrs(const SCCNodeSet &SCCNodes,
SmallSet<Function *, 8> &Changed) {
SmallSet<Function *, 8> &Changed,
bool SkipInitializes) {
ArgumentGraph AG;

// Check each function in turn, determining which pointer arguments are not
Expand Down Expand Up @@ -936,6 +1255,10 @@ static void addArgumentAttrs(const SCCNodeSet &SCCNodes,
if (addAccessAttr(&A, R))
Changed.insert(F);
}
if (!SkipInitializes && !A.onlyReadsMemory()) {
if (inferInitializes(A, *F))
Changed.insert(F);
}
}
}

Expand Down Expand Up @@ -1844,13 +2167,33 @@ deriveAttrsInPostOrder(ArrayRef<Function *> Functions, AARGetterT &&AARGetter,

SmallSet<Function *, 8> Changed;
if (ArgAttrsOnly) {
addArgumentAttrs(Nodes.SCCNodes, Changed);
// To get precise function attributes fastly, the main postorder CGSCC
// pipeline runs PostOrderFunctionAttrsPass twice, and the function
// simplification pipeline is scheduled in the middle.
//
// The first run deduces function attributes that could affect the function
// simplification pipeline, which is only the case with recursive functions.
// For non-recursive functions, it only infers argument attributes.
// The second run deduces any function attributes based on the fully
// simplified function
//
// PostOrderFunctionAttrsPass operates the call graph in "bottom-up" way:
// PostOrderFunctionAttrsPass(callee, ArgAttrsOnly) ->
// FunctionSimplificationPipeline {DSE(callee), ...} ->
// PostOrderFunctionAttrsPass2(callee) ->
// PostOrderFunctionAttrsPass(caller, ArgAttrsOnly) ->
// FunctionSimplificationPipeline {DSE(caller), ...} ->
// PostOrderFunctionAttrsPass2(caller)
// Only infer the "initializes" attribute in the 2nd run to get a precise
// attribute of callee which would be used to simplify callers in the
// function simplification pipeline (like DSE).
addArgumentAttrs(Nodes.SCCNodes, Changed, /*SkipInitializes=*/true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment as to why SkipInitializes in the ArgAttrsOnly case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done :-D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would simplify the comment a bit, this is too much detail. The important thing is that ArgAttrsOnly means to only infer attributes that may aid optimizations on the current function. initializes does not fall into that category.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

return Changed;
}

addArgumentReturnedAttrs(Nodes.SCCNodes, Changed);
addMemoryAttrs(Nodes.SCCNodes, AARGetter, Changed);
addArgumentAttrs(Nodes.SCCNodes, Changed);
addArgumentAttrs(Nodes.SCCNodes, Changed, /*SkipInitializes=*/false);
inferConvergent(Nodes.SCCNodes, Changed);
addNoReturnAttrs(Nodes.SCCNodes, Changed);
addWillReturn(Nodes.SCCNodes, Changed);
Expand Down
Loading
Loading