[CGData][MachineOutliner] Global Outlining2
This commit introduces support for outlining functions across modules using codegen data generated by a previous codegen run. The codegen data currently manages the outlined hash tree, which records outlining instances that occurred locally in the past.

The machine outliner now operates in one of three modes:
1. CGDataMode::None: This is the default outliner mode that uses the suffix tree to identify (local) outlining candidates within a module. This mode is also used by (full)LTO to maintain optimal behavior with the combined module.
2. CGDataMode::Write (`codegen-data-generate`): This mode is identical to the default mode, but it also publishes the stable hash sequences of instructions in the outlined functions into a local outlined hash tree. It then encodes this into the `__llvm_outline` section, which will be dead-stripped at link time.
3. CGDataMode::Read (`codegen-data-use-path={.cgdata}`): This mode reads a codegen data file (.cgdata) and initializes a global outlined hash tree. This tree is used to generate global outlining candidates. Note that the codegen data file has been post-processed with the raw `__llvm_outline` sections from all native objects using the `llvm-cgdata` tool (or a linker, `LLD`, or a new ThinLTO pipeline later).
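
The three modes above can be sketched as a small selection function. This is a minimal illustration, not the outliner's actual code: `selectMode` and its parameters are hypothetical names, and the precedence (generate first, then use) is assumed from the `if` / `else if` order in `CodeGenData::getInstance()` in this commit.

```cpp
#include <string>

// Mirrors the three mode names from the commit message.
enum class CGDataMode { None, Write, Read };

// Hypothetical helper (not part of LLVM).
//   Generate   : stands in for -codegen-data-generate
//   UsePath    : stands in for -codegen-data-use-path=<file>
//   TreeLoaded : assumed to mean "the .cgdata file was parsed and an
//                outlined hash tree was published"
CGDataMode selectMode(bool Generate, const std::string &UsePath,
                      bool TreeLoaded) {
  if (Generate)
    return CGDataMode::Write; // local outlining + publish hash sequences
  if (!UsePath.empty() && TreeLoaded)
    return CGDataMode::Read; // global candidates from the hash tree
  return CGDataMode::None; // default: suffix-tree-based local outlining
}
```

With this sketch, a read request whose file failed to load degrades to `CGDataMode::None`, which matches the warn-and-fall-back behavior described in the `CodeGenData.cpp` change.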
kyulee-com committed Aug 22, 2024
1 parent 3ed5913 commit cb23c67
Showing 14 changed files with 750 additions and 19 deletions.
36 changes: 36 additions & 0 deletions llvm/include/llvm/CodeGen/MachineOutliner.h
@@ -18,6 +18,7 @@
#include "llvm/CodeGen/LiveRegUnits.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/MachineStableHash.h"
#include <initializer_list>

namespace llvm {
@@ -274,6 +275,41 @@ struct OutlinedFunction {
  OutlinedFunction() = delete;
  virtual ~OutlinedFunction() = default;
};

/// The information necessary to create an outlined function that is matched
/// globally.
struct GlobalOutlinedFunction : public OutlinedFunction {
  GlobalOutlinedFunction(OutlinedFunction &OF, unsigned GlobalOccurrenceCount)
      : OutlinedFunction(OF.Candidates, OF.SequenceSize, OF.FrameOverhead,
                         OF.FrameConstructionID),
        GlobalOccurrenceCount(GlobalOccurrenceCount) {}

  unsigned GlobalOccurrenceCount;

  /// Return the number of times this sequence appears globally.
  /// A global outlining candidate is created uniquely for each match, but it
  /// might be erased when it overlaps with a previous outlining
  /// instance.
  unsigned getOccurrenceCount() const override {
    assert(Candidates.size() <= 1);
    return Candidates.empty() ? 0 : GlobalOccurrenceCount;
  }

  /// Return the outlining cost using the global occurrence count
  /// with the same cost as the first (unique) candidate.
  unsigned getOutliningCost() const override {
    assert(Candidates.size() <= 1);
    unsigned CallOverhead =
        Candidates.empty()
            ? 0
            : Candidates[0].getCallOverhead() * getOccurrenceCount();
    return CallOverhead + SequenceSize + FrameOverhead;
  }

  GlobalOutlinedFunction() = delete;
  ~GlobalOutlinedFunction() = default;
};

} // namespace outliner
} // namespace llvm
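
As a worked illustration of the cost computation in `GlobalOutlinedFunction` above, the following standalone sketch reproduces the same arithmetic without any LLVM dependencies. `GlobalCostSketch` and its field names are hypothetical stand-ins; in particular, `NumLocalCandidates` and `CallOverhead` replace `Candidates.size()` and `Candidates[0].getCallOverhead()`, and the numbers in the usage note are made up.

```cpp
#include <cassert>

// Hypothetical stand-in for outliner::GlobalOutlinedFunction (not LLVM code).
struct GlobalCostSketch {
  unsigned NumLocalCandidates;    // 0 or 1: the unique local match, if any
  unsigned CallOverhead;          // per-call overhead of that candidate
  unsigned SequenceSize;          // size of the outlined sequence
  unsigned FrameOverhead;         // overhead of the outlined function's frame
  unsigned GlobalOccurrenceCount; // occurrences recorded in the hash tree

  // Zero once the unique candidate has been erased, else the global count.
  unsigned occurrenceCount() const {
    assert(NumLocalCandidates <= 1);
    return NumLocalCandidates == 0 ? 0 : GlobalOccurrenceCount;
  }

  // Call overhead is scaled by the global count, not the local one.
  unsigned outliningCost() const {
    assert(NumLocalCandidates <= 1);
    unsigned Calls =
        NumLocalCandidates == 0 ? 0 : CallOverhead * occurrenceCount();
    return Calls + SequenceSize + FrameOverhead;
  }
};
```

For example, `GlobalCostSketch{1, 4, 12, 4, 5}` yields a cost of 4 * 5 + 12 + 4 = 36, while the same values with no surviving candidate (`NumLocalCandidates == 0`) yield 12 + 4 = 16 and an occurrence count of 0.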

26 changes: 25 additions & 1 deletion llvm/lib/CGData/CodeGenData.cpp
@@ -24,6 +24,13 @@
using namespace llvm;
using namespace cgdata;

cl::opt<bool>
    CodeGenDataGenerate("codegen-data-generate", cl::init(false), cl::Hidden,
                        cl::desc("Emit CodeGen Data into custom sections"));
cl::opt<std::string>
    CodeGenDataUsePath("codegen-data-use-path", cl::init(""), cl::Hidden,
                       cl::desc("File path to where .cgdata file is read"));

static std::string getCGDataErrString(cgdata_error Err,
                                      const std::string &ErrMsg = "") {
  std::string Msg;
@@ -132,7 +139,24 @@ CodeGenData &CodeGenData::getInstance() {
  std::call_once(CodeGenData::OnceFlag, []() {
    Instance = std::unique_ptr<CodeGenData>(new CodeGenData());

    // TODO: Initialize writer or reader mode for the client optimization.
    if (CodeGenDataGenerate)
      Instance->EmitCGData = true;
    else if (!CodeGenDataUsePath.empty()) {
      // Initialize the global CGData if the input file name is given.
      // We do not error-out when failing to parse the input file.
      // Instead, just emit a warning message and fall back as if no CGData
      // were available.
      auto FS = vfs::getRealFileSystem();
      auto ReaderOrErr = CodeGenDataReader::create(CodeGenDataUsePath, *FS);
      if (Error E = ReaderOrErr.takeError()) {
        warn(std::move(E), CodeGenDataUsePath);
        return;
      }
      // Publish each CGData based on the data type in the header.
      auto Reader = ReaderOrErr->get();
      if (Reader->hasOutlinedHashTree())
        Instance->publishOutlinedHashTree(Reader->releaseOutlinedHashTree());
    }
  });
  return *(Instance.get());
}
1 change: 1 addition & 0 deletions llvm/lib/CodeGen/CMakeLists.txt
@@ -267,6 +267,7 @@ add_llvm_component_library(LLVMCodeGen
Analysis
BitReader
BitWriter
CGData
CodeGenTypes
Core
MC