Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ctx_prof] Extend WorkloadImportsManager to use the contextual profile #98682

Merged
merged 9 commits into from
Jul 29, 2024

Conversation

mtrofin
Copy link
Member

@mtrofin mtrofin commented Jul 12, 2024

Keeping the json-based input as it's useful for diagnostics or for driving the import by other means than contextual composition.

The support for the contextual profile is just another modality for constructing the import list (WorkloadImportsManager::Workloads). Everything else - i.e. the actual importing logic - is already independent from how that list was obtained.

Issue #89287

Keeping the json-based input as it's useful for diagnostics or for driving
the import by other means than contextual composition.

The support for the contextual profile is just another modality for
constructing the import list (`WorkloadImportsManager::Workloads`).
Everything else - i.e. the actual importing logic - is already independent
from how that list was obtained.
@llvmbot llvmbot added PGO Profile Guided Optimizations LTO Link time optimization (regular/full LTO or ThinLTO) llvm:transforms labels Jul 12, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 12, 2024

@llvm/pr-subscribers-lto
@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-transforms

Author: Mircea Trofin (mtrofin)

Changes

Keeping the json-based input as it's useful for diagnostics or for driving the import by other means than contextual composition.

The support for the contextual profile is just another modality for constructing the import list (WorkloadImportsManager::Workloads). Everything else - i.e. the actual importing logic - is already independent from how that list was obtained.


Full diff: https://github.com/llvm/llvm-project/pull/98682.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/FunctionImport.cpp (+68-8)
  • (modified) llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp (+9)
diff --git a/llvm/lib/Transforms/IPO/FunctionImport.cpp b/llvm/lib/Transforms/IPO/FunctionImport.cpp
index 038785114a0cf..985e419c2bb5e 100644
--- a/llvm/lib/Transforms/IPO/FunctionImport.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionImport.cpp
@@ -18,6 +18,7 @@
 #include "llvm/ADT/Statistic.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Bitcode/BitcodeReader.h"
+#include "llvm/Bitstream/BitstreamWriter.h"
 #include "llvm/IR/AutoUpgrade.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/Function.h"
@@ -30,6 +31,7 @@
 #include "llvm/IR/ModuleSummaryIndex.h"
 #include "llvm/IRReader/IRReader.h"
 #include "llvm/Linker/IRMover.h"
+#include "llvm/ProfileData/PGOCtxProfReader.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
@@ -174,6 +176,10 @@ static cl::opt<std::string> WorkloadDefinitions(
              "}"),
     cl::Hidden);
 
+static cl::opt<std::string>
+    ContextualProfile("thinlto-pgo-ctx-prof",
+                      cl::desc("Path to a contextual profile."), cl::Hidden);
+
 namespace llvm {
 extern cl::opt<bool> EnableMemProfContextDisambiguation;
 }
@@ -586,13 +592,7 @@ class WorkloadImportsManager : public ModuleImportsManager {
     LLVM_DEBUG(dbgs() << "[Workload] Done\n");
   }
 
-public:
-  WorkloadImportsManager(
-      function_ref<bool(GlobalValue::GUID, const GlobalValueSummary *)>
-          IsPrevailing,
-      const ModuleSummaryIndex &Index,
-      DenseMap<StringRef, FunctionImporter::ExportSetTy> *ExportLists)
-      : ModuleImportsManager(IsPrevailing, Index, ExportLists) {
+  void loadFromJson() {
     // Since the workload def uses names, we need a quick lookup
     // name->ValueInfo.
     StringMap<ValueInfo> NameToValueInfo;
@@ -672,6 +672,66 @@ class WorkloadImportsManager : public ModuleImportsManager {
       });
     }
   }
+
+  void loadFromCtxProf() {
+    std::error_code EC;
+    auto BufferOrErr = MemoryBuffer::getFileOrSTDIN(ContextualProfile);
+    if (std::error_code EC = BufferOrErr.getError()) {
+      report_fatal_error("Failed to open contextual profile file");
+      return;
+    }
+    auto Buffer = std::move(BufferOrErr.get());
+
+    BitstreamCursor Cursor(*Buffer);
+    PGOCtxProfileReader Reader(Cursor);
+    auto Ctx = Reader.loadContexts();
+    if (!Ctx) {
+      report_fatal_error("Failed to parse contextual profiles");
+      return;
+    }
+    const auto &CtxMap = *Ctx;
+    for (const auto &[RootGuid, Root] : CtxMap) {
+      auto RootVI = Index.getValueInfo(RootGuid);
+      if (!RootVI) {
+        LLVM_DEBUG(dbgs() << "[Workload] Root " << RootGuid
+                          << " not found in this linkage unit.\n");
+        continue;
+      }
+      if (RootVI.getSummaryList().size() != 1) {
+        LLVM_DEBUG(dbgs() << "[Workload] Root " << RootGuid
+                          << " should have exactly one summary, but has "
+                          << RootVI.getSummaryList().size() << ". Skipping.\n");
+        continue;
+      }
+      StringRef RootDefiningModule =
+          RootVI.getSummaryList().front()->modulePath();
+      LLVM_DEBUG(dbgs() << "[Workload] Root defining module for " << RootGuid
+                        << " is : " << RootDefiningModule << "\n");
+      DenseSet<GlobalValue::GUID> ContainedGUIDs;
+      auto &Set = Workloads[RootDefiningModule];
+      Root.getContainedGuids(ContainedGUIDs);
+      for (auto Guid : ContainedGUIDs)
+        if (auto VI = Index.getValueInfo(Guid))
+          Set.insert(VI);
+    }
+  }
+
+public:
+  WorkloadImportsManager(
+      function_ref<bool(GlobalValue::GUID, const GlobalValueSummary *)>
+          IsPrevailing,
+      const ModuleSummaryIndex &Index,
+      DenseMap<StringRef, FunctionImporter::ExportSetTy> *ExportLists)
+      : ModuleImportsManager(IsPrevailing, Index, ExportLists) {
+    if (ContextualProfile.empty() == WorkloadDefinitions.empty()) {
+      report_fatal_error(
+          "Pass only one of: -thinlto-pgo-ctx-prof or -thinlto-workload-def");
+      return;
+    }
+    if (!ContextualProfile.empty())
+      loadFromCtxProf();
+    loadFromJson();
+  }
 };
 
 std::unique_ptr<ModuleImportsManager> ModuleImportsManager::create(
@@ -679,7 +739,7 @@ std::unique_ptr<ModuleImportsManager> ModuleImportsManager::create(
         IsPrevailing,
     const ModuleSummaryIndex &Index,
     DenseMap<StringRef, FunctionImporter::ExportSetTy> *ExportLists) {
-  if (WorkloadDefinitions.empty()) {
+  if (WorkloadDefinitions.empty() && ContextualProfile.empty()) {
     LLVM_DEBUG(dbgs() << "[Workload] Using the regular imports manager.\n");
     return std::unique_ptr<ModuleImportsManager>(
         new ModuleImportsManager(IsPrevailing, Index, ExportLists));
diff --git a/llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp b/llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp
index d2cdbb28e2fce..6c6798ded00b5 100644
--- a/llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp
+++ b/llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp
@@ -115,6 +115,15 @@ TEST_F(PGOCtxProfRWTest, RoundTrip) {
     EXPECT_EQ(Ctxes.size(), 2U);
     for (auto &[G, R] : roots())
       checkSame(*R, Ctxes.find(G)->second);
+
+    DenseSet<GlobalValue::GUID> Guids;
+    Ctxes.at(1U).getContainedGuids(Guids);
+    EXPECT_THAT(Guids,
+                testing::WhenSorted(testing::ElementsAre(1U, 2U, 4U, 5U)));
+
+    Guids.clear();
+    Ctxes.at(3U).getContainedGuids(Guids);
+    EXPECT_THAT(Guids, testing::ElementsAre(3U));
   }
 }
 

@@ -174,6 +174,10 @@ static cl::opt<std::string> WorkloadDefinitions(
"}"),
cl::Hidden);

static cl::opt<std::string>
ContextualProfile("thinlto-pgo-ctx-prof",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test to exercise this handling?

Copy link
Member Author

@mtrofin mtrofin Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't currently have a textual representation of the contextual profile, and the import manager is internal to this file, meaning we can't write a unittest. I'd do either, though, if the new functionality was more involved, but it's really just taking data from the ctx profile reader - which is tested - and filling a set with it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant a lit test. I think it would be good to add (reading the binary format like I am assuming the ctx profile reader tests do?), otherwise this is untested functionality.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - need to figure out where to place it, because it would need to be a test that runs a program to generate a profile. Meaning it needs compiler-rt.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Root.getContainedGuids(ContainedGUIDs);
for (auto Guid : ContainedGUIDs)
if (auto VI = Index.getValueInfo(Guid))
Set.insert(VI);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The json case has some debug printing of the eventual Set contents - needed here too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved in a common place.

@@ -115,6 +115,15 @@ TEST_F(PGOCtxProfRWTest, RoundTrip) {
EXPECT_EQ(Ctxes.size(), 2U);
for (auto &[G, R] : roots())
checkSame(*R, Ctxes.find(G)->second);

DenseSet<GlobalValue::GUID> Guids;
Ctxes.at(1U).getContainedGuids(Guids);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unclear to me how this test change is related to the rest of the PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not, but this is the API that the rest of this patch hinges on. I could introduce the test separately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just commit that separately (no need for another review, that part lgtm) since that isn't a new API here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done (a742693)

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with a couple small fixes

; RUN: mkdir -p %t_baseline
; RUN: mkdir -p %t_exp
;
; Normal run. m1 shouldn't get m2_f1 because it's not referenced from there.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and ditto for m2 and m1_f1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup

@@ -174,6 +174,17 @@ static cl::opt<std::string> WorkloadDefinitions(
"}"),
cl::Hidden);

static cl::opt<bool> ImportAssumeUniqueLocal(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change should be removed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is from #100448?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, I was looking at diffs from my last review, and wasn't expecting things from other patches to show up (did it get rebased?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I needed to so I could get the llvm-ctxprof-util tool

@mtrofin mtrofin merged commit c99bd3c into llvm:main Jul 29, 2024
5 of 7 checks passed
@mtrofin mtrofin deleted the thinlto branch July 29, 2024 22:06
banach-space pushed a commit to banach-space/llvm-project that referenced this pull request Aug 7, 2024
…ile (llvm#98682)

Keeping the json-based input as it's useful for diagnostics or for driving the import by other means than contextual composition.

The support for the contextual profile is just another modality for constructing the import list (`WorkloadImportsManager::Workloads`).

Everything else - i.e. the actual importing logic - is already independent from how that list was obtained.
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Aug 22, 2024
Reverts: depends upon earlier ctx_prof reverts
c99bd3c [ctx_prof] Extend `WorkloadImportsManager` to use the contextual profile (llvm#98682)

Change-Id: I602defa5d5aa32c272ba831165cb39cf8f83b5ba
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Sep 6, 2024
…ile (llvm#98682)

Keeping the json-based input as it's useful for diagnostics or for driving the import by other means than contextual composition.

The support for the contextual profile is just another modality for constructing the import list (`WorkloadImportsManager::Workloads`).

Everything else - i.e. the actual importing logic - is already independent from how that list was obtained.

Change-Id: I267f13222a7007aeffb690feac00af90e779301a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO) PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants