-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CGData] LLD for MachO #90166
[CGData] LLD for MachO #90166
Conversation
1a96212
to
0b026f3
Compare
This commit introduces support for outlining functions across modules using codegen data generated from previous codegen. The codegen data currently manages the outlined hash tree, which records outlining instances that occurred locally in the past. The machine outliner now operates in one of three modes: 1. CGDataMode::None: This is the default outliner mode that uses the suffix tree to identify (local) outlining candidates within a module. This mode is also used by (full)LTO to maintain optimal behavior with the combined module. 2. CGDataMode::Write (`-codegen-data-generate`): This mode is identical to the default mode, but it also publishes the stable hash sequences of instructions in the outlined functions into a local outlined hash tree. It then encodes this into the `__llvm_outline` section, which will be dead-stripped at link time. 3. CGDataMode::Read (`-codegen-data-use-path={.cgdata}`): This mode reads a codegen data file (.cgdata) and initializes a global outlined hash tree. This tree is used to generate global outlining candidates. Note that the codegen data file has been post-processed with the raw `__llvm_outline` sections from all native objects using the `llvm-cgdata` tool (or a linker, `LLD`, or a new ThinLTO pipeline later). This depends on #105398. After this PR, LLD (#90166) and Clang (#90304) will follow for each client side support. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
fa2f45b
to
c1e6ec8
Compare
@llvm/pr-subscribers-lld-macho @llvm/pr-subscribers-lld Author: Kyungwoo Lee (kyulee-com) ChangesIt reads raw CG data encoded in the custom section (__llvm_outline) in object files and merges them into the indexed codegen data file specified by This depends on #90074. Full diff: https://github.com/llvm/llvm-project/pull/90166.diff 7 Files Affected:
diff --git a/lld/MachO/Config.h b/lld/MachO/Config.h
index 5fca3f16a897ed..8f6da6330d7ad4 100644
--- a/lld/MachO/Config.h
+++ b/lld/MachO/Config.h
@@ -210,6 +210,7 @@ struct Configuration {
std::vector<SectionAlign> sectionAlignments;
std::vector<SegmentProtection> segmentProtections;
bool ltoDebugPassManager = false;
+ llvm::StringRef codegenDataGeneratePath;
bool csProfileGenerate = false;
llvm::StringRef csProfilePath;
bool pgoWarnMismatch;
diff --git a/lld/MachO/Driver.cpp b/lld/MachO/Driver.cpp
index 73b83123f49b4d..e4c908f0b0aaf9 100644
--- a/lld/MachO/Driver.cpp
+++ b/lld/MachO/Driver.cpp
@@ -36,6 +36,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/BinaryFormat/MachO.h"
#include "llvm/BinaryFormat/Magic.h"
+#include "llvm/CGData/CodeGenDataWriter.h"
#include "llvm/Config/llvm-config.h"
#include "llvm/LTO/LTO.h"
#include "llvm/Object/Archive.h"
@@ -1322,6 +1323,38 @@ static void gatherInputSections() {
}
}
+static void codegenDataGenerate() {
+ TimeTraceScope timeScope("Generating codegen data");
+
+ OutlinedHashTreeRecord globalOutlineRecord;
+ for (ConcatInputSection *isec : inputSections) {
+ if (isec->getSegName() == segment_names::data &&
+ isec->getName() == section_names::outlinedHashTree) {
+ // Read outlined hash tree from each section
+ OutlinedHashTreeRecord localOutlineRecord;
+ auto *data = isec->data.data();
+ localOutlineRecord.deserialize(data);
+
+ // Merge it to the global hash tree.
+ globalOutlineRecord.merge(localOutlineRecord);
+ }
+ }
+
+ CodeGenDataWriter Writer;
+ if (!globalOutlineRecord.empty())
+ Writer.addRecord(globalOutlineRecord);
+
+ std::error_code EC;
+ auto fileName = config->codegenDataGeneratePath;
+ assert(!fileName.empty());
+ raw_fd_ostream Output(fileName, EC, sys::fs::OF_None);
+ if (EC)
+ error("fail to create raw_fd_ostream");
+
+ if (auto E = Writer.write(Output))
+ error("fail to write CGData");
+}
+
static void foldIdenticalLiterals() {
TimeTraceScope timeScope("Fold identical literals");
// We always create a cStringSection, regardless of whether dedupLiterals is
@@ -1759,6 +1792,8 @@ bool link(ArrayRef<const char *> argsArr, llvm::raw_ostream &stdoutOS,
config->ignoreAutoLinkOptions.insert(arg->getValue());
config->strictAutoLink = args.hasArg(OPT_strict_auto_link);
config->ltoDebugPassManager = args.hasArg(OPT_lto_debug_pass_manager);
+ config->codegenDataGeneratePath =
+ args.getLastArgValue(OPT_codegen_data_generate_path);
config->csProfileGenerate = args.hasArg(OPT_cs_profile_generate);
config->csProfilePath = args.getLastArgValue(OPT_cs_profile_path);
config->pgoWarnMismatch =
@@ -2103,6 +2138,10 @@ bool link(ArrayRef<const char *> argsArr, llvm::raw_ostream &stdoutOS,
}
gatherInputSections();
+
+ if (!config->codegenDataGeneratePath.empty())
+ codegenDataGenerate();
+
if (config->callGraphProfileSort)
priorityBuilder.extractCallGraphProfile();
diff --git a/lld/MachO/InputSection.h b/lld/MachO/InputSection.h
index 4e238d8ef77793..7ef0e31066f372 100644
--- a/lld/MachO/InputSection.h
+++ b/lld/MachO/InputSection.h
@@ -354,6 +354,7 @@ constexpr const char objcMethname[] = "__objc_methname";
constexpr const char objcNonLazyCatList[] = "__objc_nlcatlist";
constexpr const char objcNonLazyClassList[] = "__objc_nlclslist";
constexpr const char objcProtoList[] = "__objc_protolist";
+constexpr const char outlinedHashTree[] = "__llvm_outline";
constexpr const char pageZero[] = "__pagezero";
constexpr const char pointers[] = "__pointers";
constexpr const char rebase[] = "__rebase";
diff --git a/lld/MachO/Options.td b/lld/MachO/Options.td
index cbd28bbceef786..8c5c6a853f3cc7 100644
--- a/lld/MachO/Options.td
+++ b/lld/MachO/Options.td
@@ -162,6 +162,8 @@ def no_objc_category_merging : Flag<["-"], "no_objc_category_merging">,
Group<grp_lld>;
def lto_debug_pass_manager: Flag<["--"], "lto-debug-pass-manager">,
HelpText<"Debug new pass manager">, Group<grp_lld>;
+def codegen_data_generate_path : Joined<["--"], "codegen-data-generate-path=">,
+ HelpText<"Codegen data file path">, Group<grp_lld>;
def cs_profile_generate: Flag<["--"], "cs-profile-generate">,
HelpText<"Perform context sensitive PGO instrumentation">, Group<grp_lld>;
def cs_profile_path: Joined<["--"], "cs-profile-path=">,
diff --git a/lld/test/CMakeLists.txt b/lld/test/CMakeLists.txt
index 5d4a2757c529b8..abc8ea75da180b 100644
--- a/lld/test/CMakeLists.txt
+++ b/lld/test/CMakeLists.txt
@@ -48,6 +48,7 @@ if (NOT LLD_BUILT_STANDALONE)
llvm-ar
llvm-as
llvm-bcanalyzer
+ llvm-cgdata
llvm-config
llvm-cvtres
llvm-dis
diff --git a/lld/test/MachO/cgdata-generate.s b/lld/test/MachO/cgdata-generate.s
new file mode 100644
index 00000000000000..0f91070dd9cc0c
--- /dev/null
+++ b/lld/test/MachO/cgdata-generate.s
@@ -0,0 +1,94 @@
+# UNSUPPORTED: system-windows
+# REQUIRES: aarch64
+
+# RUN: rm -rf %t; split-file %s %t
+
+# Synthesize raw cgdata without the header (24 byte) from the indexed cgdata.
+# RUN: llvm-cgdata --convert --format binary %t/raw-1.cgtext -o %t/raw-1.cgdata
+# RUN: od -t x1 -j 24 -An %t/raw-1.cgdata | tr -d '\n\r\t' | sed 's/ \+/ /g; s/^ *//; s/ *$//; s/ /,0x/g; s/^/0x/' > %t/raw-1-bytes.txt
+# RUN: sed "s/<RAW_1_BYTES>/$(cat %t/raw-1-bytes.txt)/g" %t/merge-1-template.s > %t/merge-1.s
+# RUN: llvm-cgdata --convert --format binary %t/raw-2.cgtext -o %t/raw-2.cgdata
+# RUN: od -t x1 -j 24 -An %t/raw-2.cgdata | tr -d '\n\r\t' | sed 's/ \+/ /g; s/^ *//; s/ *$//; s/ /,0x/g; s/^/0x/' > %t/raw-2-bytes.txt
+# RUN: sed "s/<RAW_2_BYTES>/$(cat %t/raw-2-bytes.txt)/g" %t/merge-2-template.s > %t/merge-2.s
+
+# RUN: llvm-mc -filetype obj -triple arm64-apple-darwin %t/merge-1.s -o %t/merge-1.o
+# RUN: llvm-mc -filetype obj -triple arm64-apple-darwin %t/merge-2.s -o %t/merge-2.o
+# RUN: llvm-mc -filetype obj -triple arm64-apple-darwin %t/main.s -o %t/main.o
+
+# This checks if the codegen data from the linker is identical to the merged codegen data
+# from each object file, which is obtained using the llvm-cgdata tool.
+# RUN: %no-arg-lld -dylib -arch arm64 -platform_version ios 14.0 15.0 -o %t/out \
+# RUN: %t/merge-1.o %t/merge-2.o %t/main.o --codegen-data-generate-path=%t/out-cgdata
+# RUN: llvm-cgdata --merge %t/merge-1.o %t/merge-2.o %t/main.o -o %t/merge-cgdata
+# RUN: diff %t/out-cgdata %t/merge-cgdata
+
+# Merge order doesn't matter. `main.o` is dropped due to missing __llvm_outline.
+# RUN: llvm-cgdata --merge %t/merge-2.o %t/merge-1.o -o %t/merge-cgdata-shuffle
+# RUN: diff %t/out-cgdata %t/merge-cgdata-shuffle
+
+# We can also generate the merged codegen data from the executable that is not dead-stripped.
+# RUN: llvm-objdump -h %t/out| FileCheck %s
+CHECK: __llvm_outline
+# RUN: llvm-cgdata --merge %t/out -o %t/merge-cgdata-exe
+# RUN: diff %t/merge-cgdata-exe %t/merge-cgdata
+
+# Dead-strip will remove __llvm_outline sections from the final executable.
+# But the codeden data is still correctly produced from the linker.
+# RUN: %no-arg-lld -dylib -arch arm64 -platform_version ios 14.0 15.0 -o %t/out-strip \
+# RUN: %t/merge-1.o %t/merge-2.o %t/main.o -dead_strip --codegen-data-generate-path=%t/out-cgdata-strip
+# RUN: llvm-cgdata --merge %t/merge-1.o %t/merge-2.o %t/main.o -o %t/merge-cgdata-strip
+# RUN: diff %t/out-cgdata-strip %t/merge-cgdata-strip
+# RUN: diff %t/out-cgdata-strip %t/merge-cgdata
+
+# Ensure no __llvm_outline section remains in the executable.
+# RUN: llvm-objdump -h %t/out-strip | FileCheck %s --check-prefix=STRIP
+STRIP-NOT: __llvm_outline
+
+#--- raw-1.cgtext
+:outlined_hash_tree
+0:
+ Hash: 0x0
+ Terminals: 0
+ SuccessorIds: [ 1 ]
+1:
+ Hash: 0x1
+ Terminals: 0
+ SuccessorIds: [ 2 ]
+2:
+ Hash: 0x2
+ Terminals: 4
+ SuccessorIds: [ ]
+...
+
+#--- merge-1-template.s
+.section __DATA,__llvm_outline
+_data:
+.byte <RAW_1_BYTES>
+
+#--- raw-2.cgtext
+:outlined_hash_tree
+0:
+ Hash: 0x0
+ Terminals: 0
+ SuccessorIds: [ 1 ]
+1:
+ Hash: 0x1
+ Terminals: 0
+ SuccessorIds: [ 2 ]
+2:
+ Hash: 0x3
+ Terminals: 5
+ SuccessorIds: [ ]
+...
+
+#--- merge-2-template.s
+.section __DATA,__llvm_outline
+_data:
+.byte <RAW_2_BYTES>
+
+#--- main.s
+.globl _main
+
+.text
+_main:
+ ret
diff --git a/lld/test/lit.cfg.py b/lld/test/lit.cfg.py
index d309c2ad4ee284..859094e2b57dbc 100644
--- a/lld/test/lit.cfg.py
+++ b/lld/test/lit.cfg.py
@@ -40,6 +40,7 @@
tool_patterns = [
"llc",
"llvm-as",
+ "llvm-cgdata",
"llvm-mc",
"llvm-nm",
"llvm-objdump",
|
This commit introduces support for outlining functions across modules using codegen data generated from previous codegen. The codegen data currently manages the outlined hash tree, which records outlining instances that occurred locally in the past. The machine outliner now operates in one of three modes: 1. CGDataMode::None: This is the default outliner mode that uses the suffix tree to identify (local) outlining candidates within a module. This mode is also used by (full)LTO to maintain optimal behavior with the combined module. 2. CGDataMode::Write (`-codegen-data-generate`): This mode is identical to the default mode, but it also publishes the stable hash sequences of instructions in the outlined functions into a local outlined hash tree. It then encodes this into the `__llvm_outline` section, which will be dead-stripped at link time. 3. CGDataMode::Read (`-codegen-data-use-path={.cgdata}`): This mode reads a codegen data file (.cgdata) and initializes a global outlined hash tree. This tree is used to generate global outlining candidates. Note that the codegen data file has been post-processed with the raw `__llvm_outline` sections from all native objects using the `llvm-cgdata` tool (or a linker, `LLD`, or a new ThinLTO pipeline later). This depends on llvm#105398. After this PR, LLD (llvm#90166) and Clang (llvm#90304) will follow for each client side support. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
https://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation Do you think we should support something similar given that we can pass a directory to the clang frontend flag in #90304? If we build several binaries, we still want to disambiguate the |
Thank you for the suggestion! I largely copied the Clang flags from IRPGO. Unlike IRPGO, which needs to handle many raw profile data files emitted by runtimes, here we deal with a single codegen data file at build time, similar to how a linker handles a single executable file. Therefore, I think the build system can easily configure this additional flag with the correct path for different binaries. So, I don't think there is a need to support a directory path or these modifiers for simplicity. I'm considering simplifying the Clang flag to avoid this confusion. What do you think? |
I see. If we don't support writing profiles to a directory, then I don't think it's necessary to support these modifiers. I do, however, see the value in writing to a default file ( |
I already updated #90304 accordingly to remove a directory support while stilling setting a default file ( |
Can someone take another look? I think I've addressed the comments and this code should be straightforward. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
lld/MachO/Driver.cpp
Outdated
TimeTraceScope timeScope("Generating codegen data"); | ||
|
||
OutlinedHashTreeRecord globalOutlineRecord; | ||
for (ConcatInputSection *isec : inputSections) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: You can remove the braces in the for loop. Though in practice, I've seen both conventions be used here. So up to you if you think that helps with readability.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/5762 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/151/builds/2294 Here is the relevant piece of the build log for the reference
|
This reverts commit 00c0b1a.
It reads raw CG data encoded in the custom section (__llvm_outline) in object files and merges them into the indexed codegen data file specified by `-codegen-data-generate-path={path}`. This depends on llvm#90074. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
It reads raw CG data encoded in the custom section (__llvm_outline) in object files and merges them into the indexed codegen data file specified by -codegen-data-generate-path={path}. This depends on #90074. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
It reads raw CG data encoded in the custom section (__llvm_outline) in object files and merges them into the indexed codegen data file specified by
-codegen-data-generate-path={path}
.This depends on #90074.
This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.