-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Deprecated][SampleFDO] Stale profile call-graph matching #92151
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
e6b7077
to
89bdc6e
Compare
@llvm/pr-subscribers-pgo @llvm/pr-subscribers-llvm-transforms Author: Lei Wang (wlei-llvm) ChangesProfile staleness could be due to function renaming. Given that sample profile loader relies on exact string matching, a trivial change in the function signature( such as This patch introduces stale profile function-level matching, targeting at identifying the trivial function renaming and reusing the old function profile. The main idea is to leverage the call-site anchor similarity to determine the renaming and limit the matching scope into profiled CFG edges. Some noteworthy details:
Verified on one Meta's internal big service, confirmed 90%+ of the found renaming pair is good. (There could be incorrect renaming pair if the num of the anchor is small, but checked that those functions are simple cold function) Patch is 37.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92151.diff 5 Files Affected:
diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileMatcher.h b/llvm/include/llvm/Transforms/IPO/SampleProfileMatcher.h
index b6feca5d47035..f2feb9ba8832d 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileMatcher.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileMatcher.h
@@ -15,12 +15,14 @@
#define LLVM_TRANSFORMS_IPO_SAMPLEPROFILEMATCHER_H
#include "llvm/ADT/StringSet.h"
+#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/Transforms/Utils/SampleProfileLoaderBaseImpl.h"
namespace llvm {
using AnchorList = std::vector<std::pair<LineLocation, FunctionId>>;
using AnchorMap = std::map<LineLocation, FunctionId>;
+using FunctionMap = HashKeyMap<std::unordered_map, FunctionId, Function *>;
// Sample profile matching - fuzzy match.
class SampleProfileMatcher {
@@ -58,6 +60,20 @@ class SampleProfileMatcher {
StringMap<std::unordered_map<LineLocation, MatchState, LineLocationHash>>
FuncCallsiteMatchStates;
+ struct RenameDecisionCacheHash {
+ uint64_t
+ operator()(const std::pair<const Function *, FunctionId> &P) const {
+ return hash_combine(P.first, P.second);
+ }
+ };
+ std::unordered_map<std::pair<const Function *, FunctionId>, bool,
+ RenameDecisionCacheHash>
+ RenameDecisionCache;
+
+ FunctionMap *SymbolMap;
+
+ std::shared_ptr<ProfileSymbolList> PSL;
+
// Profile mismatch statstics:
uint64_t TotalProfiledFunc = 0;
// Num of checksum-mismatched function.
@@ -80,9 +96,11 @@ class SampleProfileMatcher {
public:
SampleProfileMatcher(Module &M, SampleProfileReader &Reader,
const PseudoProbeManager *ProbeManager,
- ThinOrFullLTOPhase LTOPhase)
- : M(M), Reader(Reader), ProbeManager(ProbeManager), LTOPhase(LTOPhase){};
- void runOnModule();
+ ThinOrFullLTOPhase LTOPhase,
+ std::shared_ptr<ProfileSymbolList> PSL)
+ : M(M), Reader(Reader), ProbeManager(ProbeManager), LTOPhase(LTOPhase),
+ PSL(PSL) {};
+ void runOnModule(FunctionMap &SymbolMap);
void clearMatchingData() {
// Do not clear FuncMappings, it stores IRLoc to ProfLoc remappings which
// will be used for sample loader.
@@ -90,16 +108,20 @@ class SampleProfileMatcher {
}
private:
- FunctionSamples *getFlattenedSamplesFor(const Function &F) {
- StringRef CanonFName = FunctionSamples::getCanonicalFnName(F);
- auto It = FlattenedProfiles.find(FunctionId(CanonFName));
+ FunctionSamples *getFlattenedSamplesFor(const FunctionId &Fname) {
+ auto It = FlattenedProfiles.find(Fname);
if (It != FlattenedProfiles.end())
return &It->second;
return nullptr;
}
- void runOnFunction(Function &F);
- void findIRAnchors(const Function &F, AnchorMap &IRAnchors);
- void findProfileAnchors(const FunctionSamples &FS, AnchorMap &ProfileAnchors);
+ FunctionSamples *getFlattenedSamplesFor(const Function &F) {
+ StringRef CanonFName = FunctionSamples::getCanonicalFnName(F);
+ return getFlattenedSamplesFor(FunctionId(CanonFName));
+ }
+ void runBlockLevelMatching(Function &F);
+ void findIRAnchors(const Function &F, AnchorMap &IRAnchors) const;
+ void findProfileAnchors(const FunctionSamples &FS,
+ AnchorMap &ProfileAnchors) const;
// Record the callsite match states for profile staleness report, the result
// is saved in FuncCallsiteMatchStates.
void recordCallsiteMatchStates(const Function &F, const AnchorMap &IRAnchors,
@@ -160,6 +182,20 @@ class SampleProfileMatcher {
void runStaleProfileMatching(const Function &F, const AnchorMap &IRAnchors,
const AnchorMap &ProfileAnchors,
LocToLocMap &IRToProfileLocationMap);
+ void findIRNewCallees(Function &Caller,
+ const StringMap<Function *> &IRNewFunctions,
+ std::vector<Function *> &IRNewCallees);
+ float checkFunctionSimilarity(const Function &IRFunc,
+ const FunctionId &ProfFunc);
+ bool functionIsRenamedImpl(const Function &IRFunc,
+ const FunctionId &ProfFunc);
+ bool functionIsRenamed(const Function &IRFunc, const FunctionId &ProfFunc);
+ void
+ runFuncRenamingMatchingOnProfile(const StringMap<Function *> &IRNewFunctions,
+ FunctionSamples &FS,
+ FunctionMap &OldProfToNewSymbolMap);
+ void findIRNewFunctions(StringMap<Function *> &IRNewFunctions);
+ void runFuncLevelMatching();
void reportOrPersistProfileStats();
};
} // end namespace llvm
diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp b/llvm/lib/Transforms/IPO/SampleProfile.cpp
index 0920179fb76b7..0b096668084ca 100644
--- a/llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -544,7 +544,7 @@ class SampleProfileLoader final : public SampleProfileLoaderBaseImpl<Function> {
/// Profle Symbol list tells whether a function name appears in the binary
/// used to generate the current profile.
- std::unique_ptr<ProfileSymbolList> PSL;
+ std::shared_ptr<ProfileSymbolList> PSL;
/// Total number of samples collected in this profile.
///
@@ -2076,7 +2076,7 @@ bool SampleProfileLoader::doInitialization(Module &M,
if (ReportProfileStaleness || PersistProfileStaleness ||
SalvageStaleProfile) {
MatchingManager = std::make_unique<SampleProfileMatcher>(
- M, *Reader, ProbeManager.get(), LTOPhase);
+ M, *Reader, ProbeManager.get(), LTOPhase, PSL);
}
return true;
@@ -2197,7 +2197,7 @@ bool SampleProfileLoader::runOnModule(Module &M, ModuleAnalysisManager *AM,
if (ReportProfileStaleness || PersistProfileStaleness ||
SalvageStaleProfile) {
- MatchingManager->runOnModule();
+ MatchingManager->runOnModule(SymbolMap);
MatchingManager->clearMatchingData();
}
diff --git a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
index d7613bce4c52e..2b07856252ecd 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
@@ -20,12 +20,22 @@ using namespace sampleprof;
#define DEBUG_TYPE "sample-profile-matcher"
+static cl::opt<bool> SalvageFunctionRenaming(
+ "salvage-function-renaming", cl::Hidden, cl::init(false),
+ cl::desc("Salvage stale profile by function renaming matching."));
+
+static cl::opt<unsigned> FuncRenamingSimilarityThreshold(
+ "func-renaming-similarity-threshold", cl::Hidden, cl::init(80),
+ cl::desc(
+ "The profile function is considered being renamed if the similarity "
+ "against IR is above the given number(percentage value)."));
+
extern cl::opt<bool> SalvageStaleProfile;
extern cl::opt<bool> PersistProfileStaleness;
extern cl::opt<bool> ReportProfileStaleness;
void SampleProfileMatcher::findIRAnchors(const Function &F,
- AnchorMap &IRAnchors) {
+ AnchorMap &IRAnchors) const {
// For inlined code, recover the original callsite and callee by finding the
// top-level inline frame. e.g. For frame stack "main:1 @ foo:2 @ bar:3", the
// top-level frame is "main:1", the callsite is "1" and the callee is "foo".
@@ -95,7 +105,7 @@ void SampleProfileMatcher::findIRAnchors(const Function &F,
}
void SampleProfileMatcher::findProfileAnchors(const FunctionSamples &FS,
- AnchorMap &ProfileAnchors) {
+ AnchorMap &ProfileAnchors) const {
auto isInvalidLineOffset = [](uint32_t LineOffset) {
return LineOffset & 0x8000;
};
@@ -260,6 +270,22 @@ void SampleProfileMatcher::matchNonCallsiteLocs(
}
}
+// Filter the non-call locations from IRAnchors and ProfileAnchors and write
+// them into a list for random access later.
+static void getFilteredAnchorList(const AnchorMap &IRAnchors,
+ const AnchorMap &ProfileAnchors,
+ AnchorList &FilteredIRAnchorsList,
+ AnchorList &FilteredProfileAnchorList) {
+ for (const auto &I : IRAnchors) {
+ if (I.second.stringRef().empty())
+ continue;
+ FilteredIRAnchorsList.emplace_back(I);
+ }
+
+ for (const auto &I : ProfileAnchors)
+ FilteredProfileAnchorList.emplace_back(I);
+}
+
// Call target name anchor based profile fuzzy matching.
// Input:
// For IR locations, the anchor is the callee name of direct callsite; For
@@ -286,16 +312,9 @@ void SampleProfileMatcher::runStaleProfileMatching(
"Run stale profile matching only once per function");
AnchorList FilteredProfileAnchorList;
- for (const auto &I : ProfileAnchors)
- FilteredProfileAnchorList.emplace_back(I);
-
AnchorList FilteredIRAnchorsList;
- // Filter the non-callsite from IRAnchors.
- for (const auto &I : IRAnchors) {
- if (I.second.stringRef().empty())
- continue;
- FilteredIRAnchorsList.emplace_back(I);
- }
+ getFilteredAnchorList(IRAnchors, ProfileAnchors, FilteredIRAnchorsList,
+ FilteredProfileAnchorList);
if (FilteredIRAnchorsList.empty() || FilteredProfileAnchorList.empty())
return;
@@ -311,7 +330,7 @@ void SampleProfileMatcher::runStaleProfileMatching(
matchNonCallsiteLocs(MatchedAnchors, IRAnchors, IRToProfileLocationMap);
}
-void SampleProfileMatcher::runOnFunction(Function &F) {
+void SampleProfileMatcher::runBlockLevelMatching(Function &F) {
// We need to use flattened function samples for matching.
// Unlike IR, which includes all callsites from the source code, the callsites
// in profile only show up when they are hit by samples, i,e. the profile
@@ -590,13 +609,260 @@ void SampleProfileMatcher::computeAndReportProfileStaleness() {
}
}
-void SampleProfileMatcher::runOnModule() {
+// Find functions that don't show in the profile or profile symbol list, which
+// are supposed to be new functions. We use them as the targets for renaming
+// matching.
+void SampleProfileMatcher::findIRNewFunctions(
+ StringMap<Function *> &IRNewFunctions) {
+ // TODO: Support MD5 profile.
+ if (FunctionSamples::UseMD5)
+ return;
+ StringSet<> NamesInProfile;
+ if (auto NameTable = Reader.getNameTable()) {
+ for (auto Name : *NameTable)
+ NamesInProfile.insert(Name.stringRef());
+ }
+
+ for (auto &F : M) {
+ // Skip declarations, as even if the function can be recognized renamed, we
+ // have nothing to do with it.
+ if (F.isDeclaration())
+ continue;
+
+ StringRef CanonFName = FunctionSamples::getCanonicalFnName(F.getName());
+ const auto *FS = getFlattenedSamplesFor(F);
+ if (FS)
+ continue;
+
+ // For extended binary, the full function name symbols exits in the profile
+ // symbol list table.
+ if (NamesInProfile.count(CanonFName))
+ continue;
+
+ if (PSL && PSL->contains(CanonFName))
+ continue;
+
+ LLVM_DEBUG(dbgs() << "Function " << CanonFName
+ << " is not in profile or symbol list table.\n");
+ IRNewFunctions[CanonFName] = &F;
+ }
+}
+
+void SampleProfileMatcher::findIRNewCallees(
+ Function &Caller, const StringMap<Function *> &IRNewFunctions,
+ std::vector<Function *> &IRNewCallees) {
+ for (auto &BB : Caller) {
+ for (auto &I : BB) {
+ const auto *CB = dyn_cast<CallBase>(&I);
+ if (!CB || isa<IntrinsicInst>(&I))
+ continue;
+ Function *Callee = CB->getCalledFunction();
+ if (!Callee || Callee->isDeclaration())
+ continue;
+ StringRef CalleeName =
+ FunctionSamples::getCanonicalFnName(Callee->getName());
+ if (IRNewFunctions.count(CalleeName))
+ IRNewCallees.push_back(Callee);
+ }
+ }
+}
+
+// Use function similarity to determine if the function is renamed. Compute a
+// similarity ratio between two sequences which are the function callsite
+// anchors. The returned value is in the range [0, 1]. The bigger the value is,
+// the more similar two sequences are.
+float SampleProfileMatcher::checkFunctionSimilarity(
+ const Function &IRFunc, const FunctionId &ProfFName) {
+ AnchorMap IRAnchors;
+ findIRAnchors(IRFunc, IRAnchors);
+
+ AnchorMap ProfileAnchors;
+ const auto *FSFlattened = getFlattenedSamplesFor(ProfFName);
+ assert(FSFlattened && "Flattened profile sample is null");
+ findProfileAnchors(*FSFlattened, ProfileAnchors);
+
+ AnchorList FilteredProfileAnchorList;
+ AnchorList FilteredIRAnchorsList;
+ getFilteredAnchorList(IRAnchors, ProfileAnchors, FilteredIRAnchorsList,
+ FilteredProfileAnchorList);
+
+ // If the function is probe based, we trust the checksum info to check the
+ // similarity. Otherwise, if the checksum is mismatched, continue computing
+ // the similarity.
+ if (FunctionSamples::ProfileIsProbeBased) {
+ const auto *FuncDesc = ProbeManager->getDesc(IRFunc);
+ // Make sure function is complex enough.
+ if (IRAnchors.size() - FilteredIRAnchorsList.size() > 5 && FuncDesc &&
+ !ProbeManager->profileIsHashMismatched(*FuncDesc, *FSFlattened)) {
+ return 1.0;
+ }
+ }
+
+ if (FilteredIRAnchorsList.empty() || FilteredProfileAnchorList.empty())
+ return 0.0;
+
+ // Use the diff algorithm to find the LCS between IR and profile.
+ LocToLocMap MatchedAnchors =
+ longestCommonSequence(FilteredIRAnchorsList, FilteredProfileAnchorList);
+
+ return static_cast<float>(MatchedAnchors.size()) * 2 /
+ (FilteredIRAnchorsList.size() + FilteredProfileAnchorList.size());
+}
+
+bool SampleProfileMatcher::functionIsRenamedImpl(const Function &IRFunc,
+ const FunctionId &ProfFunc) {
+ float Similarity = checkFunctionSimilarity(IRFunc, ProfFunc);
+ LLVM_DEBUG(dbgs() << "The similarity between " << IRFunc.getName()
+ << "(IR) and " << ProfFunc << "(profile) is "
+ << format("%.2f", Similarity) << "\n");
+ return Similarity * 100 > FuncRenamingSimilarityThreshold;
+}
+
+bool SampleProfileMatcher::functionIsRenamed(const Function &IRFunc,
+ const FunctionId &ProfFunc) {
+ auto R = RenameDecisionCache.find({&IRFunc, ProfFunc});
+ if (R != RenameDecisionCache.end())
+ return R->second;
+
+ bool V = functionIsRenamedImpl(IRFunc, ProfFunc);
+ RenameDecisionCache[{&IRFunc, ProfFunc}] = V;
+ return V;
+}
+
+// Run function renaming matching on the profiled CFG edge to limit the matching
+// scope.
+void SampleProfileMatcher::runFuncRenamingMatchingOnProfile(
+ const StringMap<Function *> &IRNewFunctions, FunctionSamples &CallerFS,
+ FunctionMap &OldProfToNewSymbolMap) {
+ auto FindIRFunction = [&](const FunctionId &FName) {
+ // Function can be null if name has conflict, use optional to store the
+ // function pointer.
+ std::optional<Function *> F;
+
+ auto R = SymbolMap->find(FName);
+ if (R != SymbolMap->end())
+ F = R->second;
+
+ auto NewR = OldProfToNewSymbolMap.find(FName);
+ if (NewR != OldProfToNewSymbolMap.end())
+ F = NewR->second;
+
+ return F;
+ };
+
+ // Find the new callees from IR in the current caller scope.
+ std::vector<Function *> IRNewCallees;
+ auto Caller = FindIRFunction(CallerFS.getFunction());
+ if (Caller.has_value() && *Caller) {
+ // No callees for external function, skip the rename matching.
+ if ((*Caller)->isDeclaration())
+ return;
+ findIRNewCallees(**Caller, IRNewFunctions, IRNewCallees);
+ }
+
+ // Run renaming matching on CFG edge(caller-callee).
+ for (auto &CM :
+ const_cast<CallsiteSampleMap &>(CallerFS.getCallsiteSamples())) {
+ auto &CalleeMap = CM.second;
+ // Local container used to update the CallsiteSampleMap.
+ std::vector<std::pair<FunctionId, FunctionSamples *>> FSamplesToUpdate;
+ for (auto &CS : CalleeMap) {
+ auto &CalleeFS = CS.second;
+ auto ProfCallee = CalleeFS.getFunction();
+ auto ExistingIRCallee = FindIRFunction(ProfCallee);
+ // The profile callee is new, run renaming matching.
+ if (!ExistingIRCallee.has_value()) {
+ for (auto *IRCallee : IRNewCallees) {
+ if (functionIsRenamed(*IRCallee, ProfCallee)) {
+ FSamplesToUpdate.emplace_back(ProfCallee, &CalleeFS);
+ OldProfToNewSymbolMap[ProfCallee] = IRCallee;
+ // Update the profile in place so that the deeper level matching
+ // will find the IR function.
+ CalleeFS.setFunction(FunctionId(IRCallee->getName()));
+ LLVM_DEBUG(dbgs() << "Callee renaming is found in function "
+ << CallerFS.getFunction()
+ << ", changing profile name from " << ProfCallee
+ << " to " << IRCallee->getName() << "\n");
+ break;
+ }
+ }
+ } else {
+ // Apply the existing renaming result.
+ auto R = OldProfToNewSymbolMap.find(CalleeFS.getFunction());
+ if (R != OldProfToNewSymbolMap.end()) {
+ FunctionId IRNewCallee(R->second->getName());
+ assert(IRNewCallee != ProfCallee &&
+ "New callee symbol is not a new function");
+ FSamplesToUpdate.emplace_back(ProfCallee, &CalleeFS);
+ CalleeFS.setFunction(IRNewCallee);
+ LLVM_DEBUG(dbgs() << "Existing callee renaming is found in function "
+ << CallerFS.getFunction()
+ << ", changing profile name from " << ProfCallee
+ << " to " << IRNewCallee << "\n");
+ }
+ }
+ // Note that even there is no renaming in the current scope, there could
+ // be renaming in deeper callee scope, we need to traverse all the callee
+ // profiles.
+ runFuncRenamingMatchingOnProfile(IRNewFunctions, CalleeFS,
+ OldProfToNewSymbolMap);
+ }
+
+ // Update the CalleeMap using the new name and remove the old entry.
+ for (auto &P : FSamplesToUpdate) {
+ assert((P.first != P.second->getFunction()) &&
+ "Renamed function name should be different from the old map key");
+ CalleeMap[P.second->getFunction()] = *P.second;
+ CalleeMap.erase(P.first);
+ }
+ }
+}
+
+void SampleProfileMatcher::runFuncLevelMatching() {
+ if (!SalvageFunctionRenaming)
+ return;
+ assert(SymbolMap && "SymbolMap points to null");
+
+ StringMap<Function *> IRNewFunctions;
+ findIRNewFunctions(IRNewFunctions);
+ if (IRNewFunctions.empty())
+ return;
+
+ // The new functions found by the renaming matching. Save them into a map
+ // whose key is the old(profile) function name and value is the new(renamed)
+ // function.
+ FunctionMap OldProfToNewSymbolMap;
+ for (auto &I : Reader.getProfiles())
+ runFuncRenamingMatchingOnProfile(IRNewFunctions, I.second,
+ OldProfToNewSymbolMap);
+
+ // Update all the data generated by the old profile.
+ if (!OldProfToNewSymbolMap.empty()) {
+ // Add the new function to the SymbolMap, which will be used in
+ // SampleLoader.
+ for (auto &I : OldProfToNewSymbolMap) {
+ assert(I.second && "New function is null");
+ SymbolMap->emplace(FunctionId(I.second->getName()), I.second);
+ }
+
+ // Re-flatten the profiles after the renaming.
+ FlattenedProfiles.clear();
+ ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles,
+ FunctionSamples::ProfileIsCS);
+ }
+ RenameDecisionCache.clear();
+}
+
+void SampleProfileMatcher::runOnModule(FunctionMap &SymMap) {
ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles,
FunctionSamples::ProfileIsCS);
+ SymbolMap = &SymMap;
+ runFuncLevelMatching();
+
for (auto &F : M) {
if (skipProfileForFunction(F))
continue;
- runOnFunction(F);
+ runBlockLevelMatching(F);
}
if (SalvageStaleProfile)
distributeIRToProfileLocationMap();
diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-stale-profile-renaming.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-stale-profile-renaming.prof
new file mode 100644
index 0000000000000..1e23ef26d1b15
--- /dev/n...
[truncated]
|
There is no guarantee that these wrong matching can only happen in cold functions. I'm wondering if we should skip name matching attempt for tiny functions where don't have enough signal for high confidence matching. |
@@ -20,12 +20,22 @@ using namespace sampleprof; | |||
|
|||
#define DEBUG_TYPE "sample-profile-matcher" | |||
|
|||
static cl::opt<bool> SalvageFunctionRenaming( | |||
"salvage-function-renaming", cl::Hidden, cl::init(false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: salvage-renamed-profile
to be consistent with salvage-stale-profile
cl::desc("Salvage stale profile by function renaming matching.")); | ||
|
||
static cl::opt<unsigned> FuncRenamingSimilarityThreshold( | ||
"func-renaming-similarity-threshold", cl::Hidden, cl::init(80), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: renamed-func-similarity-threshold
std::unordered_map<std::pair<const Function *, FunctionId>, bool, | ||
RenameDecisionCacheHash> | ||
RenameDecisionCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the linter suggest format/style?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes :facepalm
@@ -260,6 +270,22 @@ void SampleProfileMatcher::matchNonCallsiteLocs( | |||
} | |||
} | |||
|
|||
// Filter the non-call locations from IRAnchors and ProfileAnchors and write | |||
// them into a list for random access later. | |||
static void getFilteredAnchorList(const AnchorMap &IRAnchors, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why file static function, instead of member (static) function for SampleProfileMatcher?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to member function.
StringRef CanonFName = FunctionSamples::getCanonicalFnName(F); | ||
return getFlattenedSamplesFor(FunctionId(CanonFName)); | ||
} | ||
void runBlockLevelMatching(Function &F); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: runCFGMatching
@@ -58,6 +60,20 @@ class SampleProfileMatcher { | |||
StringMap<std::unordered_map<LineLocation, MatchState, LineLocationHash>> | |||
FuncCallsiteMatchStates; | |||
|
|||
struct RenameDecisionCacheHash { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming: rename is not a decision. More like FunctionProfileNameMap
?
Also I suggest we don't put too much emphasis on "renaming", because this is more general than handling renaming as renaming is one of the causes.
ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles, | ||
FunctionSamples::ProfileIsCS); | ||
} | ||
RenameDecisionCache.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also assert the cache is empty on function entry?
void SampleProfileMatcher::runFuncLevelMatching() { | ||
if (!SalvageFunctionRenaming) | ||
return; | ||
assert(SymbolMap && "SymbolMap points to null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: SymbolMap points to null -> SymbolMap is null
F = R->second; | ||
|
||
auto NewR = OldProfToNewSymbolMap.find(FName); | ||
if (NewR != OldProfToNewSymbolMap.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if SymbolMap
already has the function, do we still need to look it up in OldProfToNewSymbolMap
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, Good catch, changed to return if entry is found in SymbolMap
.
findIRNewCallees(**Caller, IRNewFunctions, IRNewCallees); | ||
} | ||
|
||
// Run renaming matching on CFG edge(caller-callee). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renaming matching
-> function to profile
matching?
auto &CalleeFS = CS.second; | ||
auto ProfCallee = CalleeFS.getFunction(); | ||
auto ExistingIRCallee = FindIRFunction(ProfCallee); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are too many auto
here while the types are not too long to spell. suggest to spell out the types to help readability.
const StringMap<Function *> &IRNewFunctions, FunctionSamples &CallerFS, | ||
FunctionMap &OldProfToNewSymbolMap) { | ||
auto FindIRFunction = [&](const FunctionId &FName) { | ||
// Function can be null if name has conflict, use optional to store the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do we put null into either OldProfToNewSymbolMap or SymbolMap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's in the SampleLoader, right before the MatchingManager->runOnModule(SymbolMap);
, see the comment in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/SampleProfile.cpp#L2179-L2185.
// Note that even there is no renaming in the current scope, there could | ||
// be renaming in deeper callee scope, we need to traverse all the callee | ||
// profiles. | ||
runFuncRenamingMatchingOnProfile(IRNewFunctions, CalleeFS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we do CFG profile matching, we use flattened profile so profile can have more anchors and it's easier to match IR. In this case, I see that checkFunctionSimilarity
also use flattened profile. Should we operate this call graph level matching completely on flattened profile so we avoid the recursion here for inlinees? There is no need to repeatedly match multiple instance of an inlinee (even though there is cache).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good point. But note that this function doesn't only do the matching, but also update/write the original profile,(i,e. apply the matching results), that's why we needed to run recursively on the original profile, (see the FSamplesToUpdate
is used to update the original profile.)
That said, if we want to do it on flattened profile, one option is to split it into two function, one function for the matching using flattened profile and one function to update the original profile, that might make each function clearer and simpler.(CFG matching kind of works in this way)
Any thoughts?
|
||
// Run renaming matching on CFG edge(caller-callee). | ||
for (auto &CM : | ||
const_cast<CallsiteSampleMap &>(CallerFS.getCallsiteSamples())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this style of matching profile to function only works for inlined functions? What if a function is not inlined anywhere, but is renamed?
Should we also consider not-inlined callsites from body samples?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good catch!
I think we can do non-inlined callsite. I was originally planning to do non-inlined callsites, but later I was worried that it might increase build speed and it's less important than inline callites since those non-inlined callsites are likely cold functions, so I skipped them, then later I found that the build speed cost looks not a concern, but I forgot to add it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I still have a small concern is whether we need to consider cold function matching. I was using the !PSI->isHot()
to filter them but I removed it in this version. I guess in normal situation, user code won't have a lot of renaming, so build speed should not be a concern(verified on our internal service). just not very sure if we can have special cases that user code contains so many new functions, it can significantly slow down the matching. (maybe I just overthink..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's less important than inline callites since those non-inlined callsites are likely cold functions
Not inlined may not be cold. Caller/callee can have incompatible attributes that forbids inlining even though call is hot. For such cases, we still want to be able to do the CG level matching.
just not very sure if we can have special cases that user code contains so many new functions, it can significantly slow down the matching. (maybe I just overthink..)
as long as the stale profile matching algorithm scales linearly to number of new functions, it should be fine I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a ton for the awesome code review!
There is no guarantee that these wrong matching can only happen in cold functions. I'm wondering if we should skip name matching attempt for tiny functions where don't have enough signal for high confidence matching.
Yeah, sounds good, I tried to skip using the checksum for tiny function ( the "magic" 5 num block..) we can apply the same to the general matching.
// Note that even there is no renaming in the current scope, there could | ||
// be renaming in deeper callee scope, we need to traverse all the callee | ||
// profiles. | ||
runFuncRenamingMatchingOnProfile(IRNewFunctions, CalleeFS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good point. But note that this function doesn't only do the matching, but also update/write the original profile,(i,e. apply the matching results), that's why we needed to run recursively on the original profile, (see the FSamplesToUpdate
is used to update the original profile.)
That said, if we want to do it on flattened profile, one option is to split it into two function, one function for the matching using flattened profile and one function to update the original profile, that might make each function clearer and simpler.(CFG matching kind of works in this way)
Any thoughts?
|
||
// Run renaming matching on CFG edge(caller-callee). | ||
for (auto &CM : | ||
const_cast<CallsiteSampleMap &>(CallerFS.getCallsiteSamples())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good catch!
I think we can do non-inlined callsite. I was originally planning to do non-inlined callsites, but later I was worried that it might increase build speed and it's less important than inline callites since those non-inlined callsites are likely cold functions, so I skipped them, then later I found that the build speed cost looks not a concern, but I forgot to add it back.
|
||
// Run renaming matching on CFG edge(caller-callee). | ||
for (auto &CM : | ||
const_cast<CallsiteSampleMap &>(CallerFS.getCallsiteSamples())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I still have a small concern is whether we need to consider cold function matching. I was using the !PSI->isHot()
to filter them but I removed it in this version. I guess in normal situation, user code won't have a lot of renaming, so build speed should not be a concern(verified on our internal service). just not very sure if we can have special cases that user code contains so many new functions, it can significantly slow down the matching. (maybe I just overthink..)
continue; | ||
|
||
// For extended binary, the full function name symbols exits in the profile | ||
// symbol list table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the comment, it's both needed for extended binary.
const StringMap<Function *> &IRNewFunctions, FunctionSamples &CallerFS, | ||
FunctionMap &OldProfToNewSymbolMap) { | ||
auto FindIRFunction = [&](const FunctionId &FName) { | ||
// Function can be null if name has conflict, use optional to store the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's in the SampleLoader, right before the MatchingManager->runOnModule(SymbolMap);
, see the comment in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/IPO/SampleProfile.cpp#L2179-L2185.
F = R->second; | ||
|
||
auto NewR = OldProfToNewSymbolMap.find(FName); | ||
if (NewR != OldProfToNewSymbolMap.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, Good catch, changed to return if entry is found in SymbolMap
.
if (FunctionSamples::ProfileIsProbeBased) { | ||
const auto *FuncDesc = ProbeManager->getDesc(IRFunc); | ||
// Make sure function is complex enough. | ||
if (IRAnchors.size() - FilteredIRAnchorsList.size() > 5 && FuncDesc && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason here is if the function is too simple, like only containing one entry block or only one if-else, in other words, the num block is small, it's likely to get the checksum conflict and give a wrong matching result, I checked from the results from one service, there are some wrong matching because of this.
So, 5 is mostly my guess from this, basically I wanted to avoid 1 basic block or simple one or two if-else.
std::unordered_map<std::pair<const Function *, FunctionId>, bool, | ||
RenameDecisionCacheHash> | ||
RenameDecisionCache; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes :facepalm
In the new commit:
|
matchProfileForNewFunctions(newIRFunctions, *P, OldProfToNewSymbolMap); | ||
|
||
// Update all the data generated by the old profile. | ||
if (!OldProfToNewSymbolMap.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possible to early return when it is empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming you meant to save the indentation, yes, we can move ahead the cache clear function, done.
const_cast<SampleRecord::CallTargetMap &>(BS.second.getCallTargets()); | ||
for (const auto &TS : CTM) { | ||
const FunctionId &ProfCallee = TS.first; | ||
auto MatchRes = findOrMatchNewFunction(ProfCallee, OldProfToNewSymbolMap, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this O(N^2) with N being the number of callees?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory yes, but here the callees sets are not all the callees but only new functions(filter out existing matched functions), so it's more like New callees from IR
* New callees from Profile
, I think either of them should be small in reality.
|
||
const auto *FSFlattened = getFlattenedSamplesFor(ProfFunc); | ||
assert(FSFlattened && "Flattened profile sample is null"); | ||
// Similarity check may not be reiable if the function is tiny, we use the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/reiable/reliable/
} | ||
} | ||
|
||
std::pair<Function *, bool> SampleProfileMatcher::findOrMatchFunction( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is used to lookup IR function for caller as well, so the parameter name ProfCallee is not accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, changed to ProfFunc
return; | ||
findNewIRCallees(*IRCaller, NewIRFunctions, NewIRCallees); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it early return if the new IR caller is not found ? In this case the NewIRCallees is empty, so the following matching seems just doing redundant work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above, we want to run matching for all callers' callees, here is to traverse the nested profile to process all the caller. Even if there is no new callees in the current caller, there could be new callees in the inlined/nested callers. See line 835 there is a recursive call for matchProfileForNewFunctions
which is to run matching for children callers.
std::vector<Function *> NewIRCallees; | ||
if (auto *IRCaller = | ||
findOrMatchFunction(CallerFS.getFunction(), OldProfToNewSymbolMap) | ||
.first) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why limiting it to new IRcaller only? Any pair of matching ProfileCaller and IRCaller can be used to drive callee matching below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to clarify: this is not to limit to new IRCaller, it is for all callers. Note that here there is no 3rd parameter for function findOrMatchFunction
, which has a default argument for the NewIRCallees
(const std::vector<Function *> &NewIRCallees = std::vector<Function *>()
), in this case, it's just an empty vector which means it doesn't do any new function matching, so it's only for lookup for existing functions.
The NewIRCallees
in 774 is used in findNewIRCallees
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. It is checking '.first', not .second. Is it possible to refactor the function and split into two interfaces? One is the readonly part -- does 'find' only and make the findOrMatch to call 'find' first and if not found, try to match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, in that way we can just call the find
here which makes the code clearer. Thanks for the suggestion!
} | ||
} | ||
|
||
std::pair<Function *, SampleProfileMatcher::MatchState> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This is a bit tricky, the function can be null even if it's found in SymbolMap, so we can't relay on checking the null of the return to tell if's found. So here I use the a enum MatchState.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked too closely, but I'm hoping that a std::pair<Function *, bool>
would be enough. MatchState
makes it quite a bit complicated.
void SampleProfileMatcher::findNewIRCallees( | ||
Function &Caller, const StringMap<Function *> &NewIRFunctions, | ||
std::vector<Function *> &NewIRCallees) { | ||
for (auto &BB : Caller) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that this function may be run redundantly, we traverse the nested tree for all callers, but it's possible that the same function appears multiple times in different inline context and will be run repeatedly. Then the solution could be use a cache or run on a flattened profile(but that needs a new function to update the original profile)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Can we have a comment for high level algorithm description in the code? Also include that in the change summary.
- Should we have some stats to tell how many new functions are successfully matched with a profile?
@@ -20,12 +20,31 @@ using namespace sampleprof; | |||
|
|||
#define DEBUG_TYPE "sample-profile-matcher" | |||
|
|||
static cl::opt<bool> SalvageRenamedProfile( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove "rename" here to be consistent? For functions, it's new functions, for profiles it's unused profile. So maybe: salvage-unused-profile
?
"salvage-renamed-profile", cl::Hidden, cl::init(false), | ||
cl::desc("Salvage renamed profile by function renaming matching.")); | ||
|
||
static cl::opt<unsigned> RenamedFuncSimilarityThreshold( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func-profile-similarity-threshold
?
FunctionSamples &CalleeFS = CS.second; | ||
FunctionId ProfCallee = CalleeFS.getFunction(); | ||
auto MatchRes = | ||
findOrMatchFunction(ProfCallee, OldProfToNewSymbolMap, NewIRCallees); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we still do this for even if NewIRCallees.empty()
? Note that in the non-inined case, there is if (NewIRCallees.empty())
check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a recursive function, there needs to call the nested inlinees, so we can't early return/break, added a comment to explain this. Also the findOrMatchFunction
is replaced with matchCalleeProfile
FunctionMap &OldProfToNewSymbolMap) { | ||
// Find the new candidate callees from IR in the current caller scope. | ||
std::vector<Function *> NewIRCallees; | ||
if (auto *IRCaller = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't even have a matching IRCaller
, why do we continue here instead of just return?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because we need to run matching for the deeper children caller, note that this is a recursive function on a tree, not only run for one function node. I added more comments to explain this.
} | ||
} | ||
|
||
std::pair<Function *, SampleProfileMatcher::MatchState> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked too closely, but I'm hoping that a std::pair<Function *, bool>
would be enough. MatchState
makes it quite a bit complicated.
} | ||
} | ||
|
||
// Match inline callees. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we refactor the code to avoid duplication between inlined and non-inline cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, factor out a new function matchCalleeProfile
.
void SampleProfileMatcher::runOnModule(FunctionMap &SymMap) { | ||
ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles, | ||
FunctionSamples::ProfileIsCS); | ||
SymbolMap = &SymMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can take the pointer of SymbolMap
in ctor, and assert SymbolMap
is populated in runOnModule
// The new functions found by the renaming matching. Save them into a map | ||
// whose key is the old(profile) function name and value is the new(renamed) | ||
// function. | ||
FunctionMap OldProfToNewSymbolMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think of SymbolMap as function name to function map. As you can see from SampleProfileLoader::runOnModule
, it doesn't involve profile name when being constructed.
So SymbolMap is different from ProfileNameToFuncMap
// matching result. | ||
std::unordered_map<std::pair<const Function *, FunctionId>, bool, | ||
FuncProfNameMapHash> | ||
FunctionProfileNameMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I was suggesting
FunctionProfileNameMap
-> FuncToProfileNameMap
OldProfToNewSymbolMap
-> ProfileNameToFuncMap
because symmetric naming helps readability. I saw you changed the other one, but not this one.
/// Find the existing or new matched function using the profile name. | ||
/// | ||
/// \returns The function pointer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why using ///
style comment, while existing code in this file all use //
style comment.
Also some comments has param
sections while others don't. I would suggest simple //
style comment.
std::vector<Function *> *SampleProfileMatcher::findNewIRCallees( | ||
Function &Func, const StringMap<Function *> &NewIRFunctions) { | ||
auto R = FuncToNewCalleesMap.try_emplace(&Func, std::vector<Function *>()); | ||
std::vector<Function *> &IRCalleesToMatch = R.first->second; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function by itself has nothing to do with matching, so the name IRCalleesToMatch
can adds confusion. This is about finding new IR callees, so it should be named NewIRCallees
. Whether the result is used to match something is outside of the scope of this function, and should not be used in the naming here.
|
||
// Sort the profiles to make the matching order deterministic. | ||
std::vector<NameFunctionSamples> SortedProfiles; | ||
::llvm::sortFuncProfiles(Reader.getProfiles(), SortedProfiles); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have using namespace llvm;
at the beginning, is the full qualifier ::llvm::
necessary?
matchProfileForNewFunctions(NewIRFunctions, | ||
*const_cast<FunctionSamples *>(P.second)); | ||
|
||
clearCacheData(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inline it here and comment explaining this is for freeing memory (?)
Update:
|
Replaced by "#95135"
Profile staleness could be due to function renaming. Given that sample profile loader relies on exact string matching, a trivial change in the function signature( such as
int foo()
-->long foo()
) can make the mangled name different, the function profile(including all nested children profile) becomes unavailable.This patch introduces stale profile call-graph level matching, targeting at identifying the trivial function renaming and reusing the old function profile. The main idea is to leverage the call-site anchor similarity to determine the renaming and limit the matching scope into profiled call-graph edges.
Some noteworthy details:
findNewIRCallees
.matchCalleeProfile
.MatchProfileForNewFunctions
.The inputs are the two new function(doesn't appear in other side) lists from IR and Profile, say a)
IRRenamingFuncList
. b)ProfileRenamingFuncList
. A new functionfunctionMatchesProfile(IRFunc, ProfFunc)
is used to check the renaming for the possible <IRFunc, ProfFunc> pair inIRRenamingFuncList
XProfileRenamingFuncList
In
functionMatchesProfile(IRFunc, ProfFunc)
, extract call-site anchors and use the LCS(diff) matching to compute the equal set and we define:Similarity = |equalSet * 2| / (|A| + |B|)
. The profile name is marked as renamed if the similarity is above a threshold(--renamed-func-similarity-threshold
) and update the profile using the new name.Added a new switch
--salvage-renamed-profile
to control this, default is false.This is also complementary to the CFG level(block/call) matching, which is after the CG matching, the new identified function can further be run by the CFG matching.
Verified on one Meta's internal big service, confirmed 90%+ of the found renaming pair is good. (There could be incorrect renaming pair if the num of the anchor is small, but checked that those functions are simple cold function)