[Coro] Amortize debug info processing cost in CoroSplit #109032

Open
wants to merge 15 commits into
base: main
Changes from 3 commits
11 changes: 11 additions & 0 deletions llvm/include/llvm/Transforms/Utils/Cloning.h
@@ -175,6 +175,12 @@ void CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
ValueMapTypeRemapper *TypeMapper = nullptr,
ValueMaterializer *Materializer = nullptr);

void CloneFunctionAttributesInto(Function *NewFunc, const Function *OldFunc,
ValueToValueMapTy &VMap,
bool ModuleLevelChanges,
ValueMapTypeRemapper *TypeMapper = nullptr,
ValueMaterializer *Materializer = nullptr);

void CloneAndPruneIntoFromInst(Function *NewFunc, const Function *OldFunc,
const Instruction *StartingInst,
ValueToValueMapTy &VMap, bool ModuleLevelChanges,
@@ -199,6 +205,11 @@ void CloneAndPruneFunctionInto(Function *NewFunc, const Function *OldFunc,
const char *NameSuffix = "",
ClonedCodeInfo *CodeInfo = nullptr);

/// Process debug information from function's subprogram attachment.
DISubprogram *ProcessSubprogramAttachment(const Function &F,
Contributor

Could you elaborate on "Process"? This is more about collecting debug info, right?

Contributor Author

That's right. This function hydrates the passed DebugInfoFinder by visiting relevant components of the function's subprogram attachment. I'll update the name/comment when we get to extracting that commit in a separate PR.
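
As a rough illustration of what "hydrating" a finder means here, the sketch below mimics the collection logic with toy stand-in types. `ToySubprogram`, `ToyInstr`, `ToyFunc`, `ToyFinder`, and `collectDebugInfo` are hypothetical names invented for this example; only the control flow follows `ProcessSubprogramAttachment` as it appears in this PR, and the enumerator order of `ChangeType` is assumed to mirror `CloneFunctionChangeType`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins, NOT the LLVM classes. Enumerator order is assumed to
// mirror CloneFunctionChangeType so that the `<` comparison is meaningful.
enum class ChangeType {
  LocalChangesOnly,
  GlobalChanges,
  DifferentModule,
  ClonedModule
};

struct ToySubprogram {
  std::string Name;
};

struct ToyInstr {
  // Subprogram this instruction's debug location points at, if any.
  const ToySubprogram *Scope = nullptr;
};

struct ToyFunc {
  const ToySubprogram *Attachment = nullptr; // the function's own subprogram
  std::vector<ToyInstr> Body;
  bool HasParent = true; // whether the function lives in a module
};

// Plays the role of DebugInfoFinder: it only accumulates what it is shown.
struct ToyFinder {
  std::set<const ToySubprogram *> Subprograms;

  void processSubprogram(const ToySubprogram *SP) {
    if (SP)
      Subprograms.insert(SP);
  }
  void processInstruction(const ToyInstr &I) { processSubprogram(I.Scope); }
};

// Fills Finder and returns the subprogram that will be cloned within the
// module, or nullptr when cloning across modules.
const ToySubprogram *collectDebugInfo(const ToyFunc &F, ChangeType Changes,
                                      ToyFinder &Finder) {
  const ToySubprogram *SPClonedWithinModule = nullptr;
  if (Changes < ChangeType::DifferentModule)
    SPClonedWithinModule = F.Attachment;
  if (SPClonedWithinModule)
    Finder.processSubprogram(SPClonedWithinModule);

  // Visit instructions too, e.g. to pick up scopes of inlined callees.
  if (Changes != ChangeType::ClonedModule && F.HasParent)
    for (const ToyInstr &I : F.Body)
      Finder.processInstruction(I);

  return SPClonedWithinModule;
}
```

The function only fills the finder and reports which subprogram (if any) is cloned within the module; deciding what gets mapped to itself is left to the caller, as in `CloneFunctionInto`.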

Contributor
@felipepiovezan, Oct 21, 2024

Sounds good, now that the other two PRs are open & approved, let's proceed with this commit!

Contributor Author

Great, yes! @felipepiovezan a silly question -- now that those PRs are approved, how/when do they get merged? Stacking unmerged PRs on top of each other doesn't have the best UX in GitHub.

Contributor
@felipepiovezan, Oct 24, 2024

So I think the best option here is to merge the two PRs that are approved (#112976 and #112948). Both of those stand on their own and don't depend on each other, so it's fine to merge in w/e order, right?

And then you can grab the third commit (or more, if the subsequent commits are also independent of each other) of this PR and open a new PR (or multiple PRs, if you did the () before), so it will look fine on top of main. My understanding is that the fourth commit depends on the third though.

Regarding what to do about this umbrella PR in the meantime, you have two options: leave it as is (with the understanding that it is not meant to be merged and merely serves as a reference for all the work in the proposal), or rebase on top of main, which will get rid of the two already-merged commits, and force-push to your branch artempyanykh:fast-coro-upstream. In the second case this PR serves as a reference for the work that's left to be merged, also with the understanding that we're not merging it.

With all that, we'd never have to stack unmerged PRs on top of each other. Does that make sense?

Contributor

By the way, do you have commit access?

Contributor Author

> So I think the best option here is to merge the two PRs that are approved (#112976 and #112948). Both of those stand on their own and don't depend on each other, so it's fine to merge in w/e order, right?

That's right, yes.

> And then you can grab the third commit (or more, if the subsequent commits are also independent of each other) of this PR and open a new PR (or multiple PRs, if you did the () before), so it will look fine on top of main. My understanding is that the fourth commit depends on the third though.

After the first 2 are merged, extracting the third one in a separate PR is straightforward. With the rest of the stack there's quite a bit of sequencing, so we may need to rinse and repeat.

> rebase on top of main, which will get rid of the two already-merged commits and force push to your branch artempyanykh:fast-coro-upstream, which will mean this PR serves as a reference of the work that's left to be merged, also with the understanding that we're not merging it

+1 to this. I was planning to keep rebasing this stack anyway to make sure things keep working end-to-end while the chunks are being extracted and merged.

> By the way, do you have commit access?

No, not sure what the process for this is. Why?

Contributor

Sorry for the delay, the LLVM dev conference happened last week and we were mostly out.

> No, not sure what the process for this is. Why?

Just wondering if you needed help merging the open PRs. I've merged the first two for you.
The process for getting commit access is described here: https://llvm.org/docs/DeveloperPolicy.html#obtaining-commit-access
If you don't want to do that, you can always let the reviewers know you need someone to press the merge button for you.

CloneFunctionChangeType Changes,
DebugInfoFinder &DIFinder);

/// This class captures the data input to the InlineFunction call, and records
/// the auxiliary results produced by it.
class InlineFunctionInfo {
58 changes: 38 additions & 20 deletions llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -61,6 +61,7 @@
#include "llvm/Support/Casting.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/PrettyStackTrace.h"
#include "llvm/Support/TimeProfiler.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
@@ -116,7 +117,6 @@ class CoroCloner {

TargetTransformInfo &TTI;

public:
/// Create a cloner for a switch lowering.
CoroCloner(Function &OrigF, const Twine &Suffix, coro::Shape &Shape,
Kind FKind, TargetTransformInfo &TTI)
@@ -138,6 +138,30 @@ class CoroCloner {
assert(ActiveSuspend && "need active suspend point for continuation");
}

public:
/// Create a clone for a switch lowering.
static Function *createClone(Function &OrigF, const Twine &Suffix,
coro::Shape &Shape, Kind FKind,
TargetTransformInfo &TTI) {
TimeTraceScope FunctionScope("CoroCloner");

CoroCloner Cloner(OrigF, Suffix, Shape, FKind, TTI);
Cloner.create();
return Cloner.getFunction();
}

/// Create a clone for a continuation lowering.
static Function *createClone(Function &OrigF, const Twine &Suffix,
coro::Shape &Shape, Function *NewF,
AnyCoroSuspendInst *ActiveSuspend,
TargetTransformInfo &TTI) {
TimeTraceScope FunctionScope("CoroCloner");

CoroCloner Cloner(OrigF, Suffix, Shape, NewF, ActiveSuspend, TTI);
Cloner.create();
return Cloner.getFunction();
}

Function *getFunction() const {
assert(NewF != nullptr && "declaration not yet set");
return NewF;
@@ -1464,13 +1488,16 @@ struct SwitchCoroutineSplitter {
TargetTransformInfo &TTI) {
assert(Shape.ABI == coro::ABI::Switch);

// Create a resume clone by cloning the body of the original function,
// setting new entry block and replacing coro.suspend an appropriate value
// to force resume or cleanup pass for every suspend point.
createResumeEntryBlock(F, Shape);
auto *ResumeClone =
createClone(F, ".resume", Shape, CoroCloner::Kind::SwitchResume, TTI);
auto *DestroyClone =
createClone(F, ".destroy", Shape, CoroCloner::Kind::SwitchUnwind, TTI);
auto *CleanupClone =
createClone(F, ".cleanup", Shape, CoroCloner::Kind::SwitchCleanup, TTI);
auto *ResumeClone = CoroCloner::createClone(
F, ".resume", Shape, CoroCloner::Kind::SwitchResume, TTI);
auto *DestroyClone = CoroCloner::createClone(
F, ".destroy", Shape, CoroCloner::Kind::SwitchUnwind, TTI);
auto *CleanupClone = CoroCloner::createClone(
F, ".cleanup", Shape, CoroCloner::Kind::SwitchCleanup, TTI);

postSplitCleanup(*ResumeClone);
postSplitCleanup(*DestroyClone);
@@ -1560,17 +1587,6 @@ struct SwitchCoroutineSplitter {
}

private:
// Create a resume clone by cloning the body of the original function, setting
// new entry block and replacing coro.suspend an appropriate value to force
// resume or cleanup pass for every suspend point.
static Function *createClone(Function &F, const Twine &Suffix,
coro::Shape &Shape, CoroCloner::Kind FKind,
TargetTransformInfo &TTI) {
CoroCloner Cloner(F, Suffix, Shape, FKind, TTI);
Cloner.create();
return Cloner.getFunction();
}

// Create an entry block for a resume function with a switch that will jump to
// suspend points.
static void createResumeEntryBlock(Function &F, coro::Shape &Shape) {
@@ -1870,7 +1886,8 @@ static void splitAsyncCoroutine(Function &F, coro::Shape &Shape,
auto *Suspend = Shape.CoroSuspends[Idx];
auto *Clone = Clones[Idx];

CoroCloner(F, "resume." + Twine(Idx), Shape, Clone, Suspend, TTI).create();
CoroCloner::createClone(F, "resume." + Twine(Idx), Shape, Clone, Suspend,
TTI);
}
}

@@ -1999,7 +2016,8 @@ static void splitRetconCoroutine(Function &F, coro::Shape &Shape,
auto Suspend = Shape.CoroSuspends[i];
auto Clone = Clones[i];

CoroCloner(F, "resume." + Twine(i), Shape, Clone, Suspend, TTI).create();
CoroCloner::createClone(F, "resume." + Twine(i), Shape, Clone, Suspend,
TTI);
}
}

111 changes: 70 additions & 41 deletions llvm/lib/Transforms/Utils/CloneFunction.cpp
@@ -86,28 +86,14 @@ BasicBlock *llvm::CloneBasicBlock(const BasicBlock *BB, ValueToValueMapTy &VMap,
return NewBB;
}

// Clone OldFunc into NewFunc, transforming the old arguments into references to
// VMap values.
//
void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
ValueToValueMapTy &VMap,
CloneFunctionChangeType Changes,
SmallVectorImpl<ReturnInst *> &Returns,
const char *NameSuffix, ClonedCodeInfo *CodeInfo,
ValueMapTypeRemapper *TypeMapper,
ValueMaterializer *Materializer) {
NewFunc->setIsNewDbgInfoFormat(OldFunc->IsNewDbgInfoFormat);
assert(NameSuffix && "NameSuffix cannot be null!");

#ifndef NDEBUG
for (const Argument &I : OldFunc->args())
assert(VMap.count(&I) && "No mapping from source argument specified!");
#endif

bool ModuleLevelChanges = Changes > CloneFunctionChangeType::LocalChangesOnly;

// Copy all attributes other than those stored in the AttributeList. We need
// to remap the parameter indices of the AttributeList.
// Copy all attributes other than those stored in the AttributeList. We need
Contributor

I'm not sure this comment is making a lot of sense as a function-level comment. To be honest even the original comment was confusing: which AttributeList? If you know the answer to this, could you take the chance provided by this NFC commit to improve the comment?

Contributor

This second commit could also use a brief commit message describing why this change is useful.

Contributor Author

I think the comment refers to the AttributeList field of a Function (the only such field), which holds e.g. parameter and return value attributes. I updated the comment in #112976 to (hopefully) make it a bit clearer.
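
To make "remap the parameter indices" concrete, here is a minimal sketch with toy types. `ToyAttributeList`, `AttrSet`, and `remapParamAttrs` are hypothetical and not LLVM API; the index map stands in for looking each old argument up in `VMap`. The idea is that function-level and return-value attributes carry over unchanged, while each parameter's attribute set has to move to wherever that parameter ends up in the new function.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy attribute set: just a list of attribute names.
using AttrSet = std::vector<std::string>;

// Toy model of an attribute list: one function-level set, one return-value
// set, and one set per parameter, indexed by parameter position.
struct ToyAttributeList {
  AttrSet FnAttrs;
  AttrSet RetAttrs;
  std::vector<AttrSet> ParamAttrs;
};

// OldToNew maps an old parameter index to its index in the new function.
// Old parameters missing from the map (e.g. replaced by a constant)
// contribute no parameter attributes to the result.
ToyAttributeList remapParamAttrs(const ToyAttributeList &Old,
                                 const std::map<unsigned, unsigned> &OldToNew,
                                 unsigned NewParamCount) {
  ToyAttributeList New;
  New.FnAttrs = Old.FnAttrs;   // carried over as-is
  New.RetAttrs = Old.RetAttrs; // carried over as-is
  New.ParamAttrs.resize(NewParamCount);
  for (const auto &Entry : OldToNew) {
    unsigned OldIdx = Entry.first;
    unsigned NewIdx = Entry.second;
    assert(OldIdx < Old.ParamAttrs.size() && NewIdx < NewParamCount);
    New.ParamAttrs[NewIdx] = Old.ParamAttrs[OldIdx];
  }
  return New;
}
```

For instance, mapping old parameter 0 to new position 2 moves its attribute set to slot 2 and leaves any unmapped new slots empty.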

// to remap the parameter indices of the AttributeList.
void llvm::CloneFunctionAttributesInto(Function *NewFunc,
const Function *OldFunc,
ValueToValueMapTy &VMap,
bool ModuleLevelChanges,
ValueMapTypeRemapper *TypeMapper,
ValueMaterializer *Materializer) {
AttributeList NewAttrs = NewFunc->getAttributes();
NewFunc->copyAttributesFrom(OldFunc);
NewFunc->setAttributes(NewAttrs);
@@ -147,6 +133,52 @@ void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
NewFunc->setAttributes(
AttributeList::get(NewFunc->getContext(), OldAttrs.getFnAttrs(),
OldAttrs.getRetAttrs(), NewArgAttrs));
}

DISubprogram *llvm::ProcessSubprogramAttachment(const Function &F,
CloneFunctionChangeType Changes,
DebugInfoFinder &DIFinder) {
DISubprogram *SPClonedWithinModule = nullptr;
if (Changes < CloneFunctionChangeType::DifferentModule) {
SPClonedWithinModule = F.getSubprogram();
}
Contributor

LLVM's coding guidelines prescribe not using {} around single-statement blocks.
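
For reference, the braceless form the guideline asks for would look roughly like this. The enum and struct are stand-ins so the snippet compiles on its own: `ChangeKind`, `FakeSubprogram`, and `subprogramToClone` are hypothetical names, and the enumerator order is assumed to mirror `CloneFunctionChangeType`.

```cpp
#include <cassert>

// Stand-in enum; the order is assumed to match CloneFunctionChangeType.
enum class ChangeKind {
  LocalChangesOnly,
  GlobalChanges,
  DifferentModule,
  ClonedModule
};

// Stand-in for DISubprogram; only its address matters here.
struct FakeSubprogram {};

// Same logic as the reviewed lines, with the braces dropped because the
// `if` body is a single simple statement.
const FakeSubprogram *subprogramToClone(const FakeSubprogram *Attached,
                                        ChangeKind Changes) {
  const FakeSubprogram *SPClonedWithinModule = nullptr;
  if (Changes < ChangeKind::DifferentModule)
    SPClonedWithinModule = Attached;
  return SPClonedWithinModule;
}
```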

if (SPClonedWithinModule)
DIFinder.processSubprogram(SPClonedWithinModule);

const Module *M = F.getParent();
if (Changes != CloneFunctionChangeType::ClonedModule && M) {
// Inspect instructions to process e.g. DILexicalBlocks of inlined functions
for (const auto &BB : F) {
for (const auto &I : BB) {
DIFinder.processInstruction(*M, I);
}
}
}

return SPClonedWithinModule;
}

// Clone OldFunc into NewFunc, transforming the old arguments into references to
// VMap values.
void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
ValueToValueMapTy &VMap,
CloneFunctionChangeType Changes,
SmallVectorImpl<ReturnInst *> &Returns,
const char *NameSuffix, ClonedCodeInfo *CodeInfo,
ValueMapTypeRemapper *TypeMapper,
ValueMaterializer *Materializer) {
NewFunc->setIsNewDbgInfoFormat(OldFunc->IsNewDbgInfoFormat);
assert(NameSuffix && "NameSuffix cannot be null!");

#ifndef NDEBUG
for (const Argument &I : OldFunc->args())
assert(VMap.count(&I) && "No mapping from source argument specified!");
#endif

bool ModuleLevelChanges = Changes > CloneFunctionChangeType::LocalChangesOnly;

CloneFunctionAttributesInto(NewFunc, OldFunc, VMap, ModuleLevelChanges,
TypeMapper, Materializer);

// Everything else beyond this point deals with function instructions,
// so if we are dealing with a function declaration, we're done.
@@ -158,23 +190,19 @@ void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
// duplicate instructions and then freeze them in the MD map. We also record
// information about dbg.value and dbg.declare to avoid duplicating the
// types.
std::optional<DebugInfoFinder> DIFinder;
DebugInfoFinder DIFinder;

// Track the subprogram attachment that needs to be cloned to fine-tune the
// mapping within the same module.
DISubprogram *SPClonedWithinModule = nullptr;
if (Changes < CloneFunctionChangeType::DifferentModule) {
// Need to find subprograms, types, and compile units.

assert((NewFunc->getParent() == nullptr ||
NewFunc->getParent() == OldFunc->getParent()) &&
"Expected NewFunc to have the same parent, or no parent");

// Need to find subprograms, types, and compile units.
DIFinder.emplace();

SPClonedWithinModule = OldFunc->getSubprogram();
if (SPClonedWithinModule)
DIFinder->processSubprogram(SPClonedWithinModule);
} else {
// Need to find all the compile units.

assert((NewFunc->getParent() == nullptr ||
NewFunc->getParent() != OldFunc->getParent()) &&
"Expected NewFunc to have different parents, or no parent");
@@ -183,19 +211,20 @@ void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
assert(NewFunc->getParent() &&
"Need parent of new function to maintain debug info invariants");

// Need to find all the compile units.
DIFinder.emplace();
}
}

DISubprogram *SPClonedWithinModule =
ProcessSubprogramAttachment(*OldFunc, Changes, DIFinder);

// Loop over all of the basic blocks in the function, cloning them as
// appropriate. Note that we save BE this way in order to handle cloning of
// recursive functions into themselves.
for (const BasicBlock &BB : *OldFunc) {

// Create a new basic block and copy instructions into it!
BasicBlock *CBB = CloneBasicBlock(&BB, VMap, NameSuffix, NewFunc, CodeInfo,
DIFinder ? &*DIFinder : nullptr);
BasicBlock *CBB =
CloneBasicBlock(&BB, VMap, NameSuffix, NewFunc, CodeInfo, nullptr);

// Add basic block mapping.
VMap[&BB] = CBB;
@@ -218,7 +247,7 @@ void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
}

if (Changes < CloneFunctionChangeType::DifferentModule &&
DIFinder->subprogram_count() > 0) {
DIFinder.subprogram_count() > 0) {
// Turn on module-level changes, since we need to clone (some of) the
// debug info metadata.
//
@@ -233,24 +262,24 @@ void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,

// Avoid cloning types, compile units, and (other) subprograms.
SmallPtrSet<const DISubprogram *, 16> MappedToSelfSPs;
for (DISubprogram *ISP : DIFinder->subprograms()) {
for (DISubprogram *ISP : DIFinder.subprograms()) {
if (ISP != SPClonedWithinModule) {
mapToSelfIfNew(ISP);
MappedToSelfSPs.insert(ISP);
}
}

// If a subprogram isn't going to be cloned skip its lexical blocks as well.
for (DIScope *S : DIFinder->scopes()) {
for (DIScope *S : DIFinder.scopes()) {
auto *LScope = dyn_cast<DILocalScope>(S);
if (LScope && MappedToSelfSPs.count(LScope->getSubprogram()))
mapToSelfIfNew(S);
}

for (DICompileUnit *CU : DIFinder->compile_units())
for (DICompileUnit *CU : DIFinder.compile_units())
mapToSelfIfNew(CU);

for (DIType *Type : DIFinder->types())
for (DIType *Type : DIFinder.types())
mapToSelfIfNew(Type);
} else {
assert(!SPClonedWithinModule &&
@@ -304,7 +333,7 @@ void llvm::CloneFunctionInto(Function *NewFunc, const Function *OldFunc,
SmallPtrSet<const void *, 8> Visited;
for (auto *Operand : NMD->operands())
Visited.insert(Operand);
for (auto *Unit : DIFinder->compile_units()) {
for (auto *Unit : DIFinder.compile_units()) {
MDNode *MappedUnit =
MapMetadata(Unit, VMap, RF_None, TypeMapper, Materializer);
if (Visited.insert(MappedUnit).second)