Merge branch 'main' into hgh/libcxx/P2591R5-Concatenation-of-string-and-string-views
H-G-Hristov authored Jul 16, 2024
2 parents d799472 + 48f55ba commit 0820ad9
Showing 1,063 changed files with 77,887 additions and 28,104 deletions.
14 changes: 7 additions & 7 deletions .github/CODEOWNERS
@@ -67,11 +67,11 @@ clang/test/AST/Interp/ @tbaederr
/mlir/include/mlir/Dialect/Linalg @dcaballe @nicolasvasilache @rengolin
/mlir/lib/Dialect/Linalg @dcaballe @nicolasvasilache @rengolin
/mlir/lib/Dialect/Linalg/Transforms/DecomposeLinalgOps.cpp @MaheshRavishankar @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp @MaheshRavishankar @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp @dcaballe @MaheshRavishankar @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/ElementwiseOpFusion.cpp @MaheshRavishankar @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/DataLayoutPropagation.cpp @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp @dcaballe @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp @banach-space @dcaballe @hanhanW @nicolasvasilache

# MemRef Dialect in MLIR.
/mlir/lib/Dialect/MemRef/Transforms/EmulateNarrowType.cpp @MaheshRavishankar @nicolasvasilache
@@ -85,10 +85,10 @@ clang/test/AST/Interp/ @tbaederr
/mlir/**/*VectorToSCF* @banach-space @dcaballe @matthias-springer @nicolasvasilache
/mlir/**/*VectorToLLVM* @banach-space @dcaballe @nicolasvasilache
/mlir/**/*X86Vector* @aartbik @dcaballe @nicolasvasilache
/mlir/include/mlir/Dialect/Vector @dcaballe @nicolasvasilache
/mlir/lib/Dialect/Vector @dcaballe @nicolasvasilache
/mlir/lib/Dialect/Vector/Transforms/* @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp @MaheshRavishankar @nicolasvasilache
/mlir/include/mlir/Dialect/Vector @banach-space @dcaballe @nicolasvasilache
/mlir/lib/Dialect/Vector @banach-space @dcaballe @nicolasvasilache
/mlir/lib/Dialect/Vector/Transforms/* @banach-space @dcaballe @hanhanW @nicolasvasilache
/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp @banach-space @dcaballe @MaheshRavishankar @nicolasvasilache
/mlir/**/*EmulateNarrowType* @dcaballe @hanhanW

# Presburger library in MLIR
8 changes: 7 additions & 1 deletion bolt/docs/CommandLineArgumentReference.md
@@ -283,6 +283,12 @@

List of functions to pad with amount of bytes

- `--print-mappings`

Print mappings in the legend, between characters/blocks and text sections
(default false).


- `--profile-format=<value>`

Format to dump profile output in aggregation mode, default is fdata
@@ -1240,4 +1246,4 @@

- `--print-options`

Print non-default options after command line parsing
Print non-default options after command line parsing
Binary file added bolt/docs/HeatmapHeader.png
68 changes: 56 additions & 12 deletions bolt/docs/Heatmaps.md
@@ -1,9 +1,9 @@
# Code Heatmaps

BOLT has gained the ability to print code heatmaps based on
sampling-based LBR profiles generated by `perf`. The output is produced
in colored ASCII to be displayed in a color-capable terminal. It looks
something like this:
sampling-based profiles generated by `perf`, either with `LBR` data or not.
The output is produced in colored ASCII to be displayed in a color-capable
terminal. It looks something like this:

![](./Heatmap.png)

@@ -32,20 +32,64 @@ $ llvm-bolt-heatmap -p perf.data <executable>
```

By default the heatmap will be dumped to *stdout*. You can change it
with `-o <heatmapfile>` option. Each character/block in the heatmap
shows the execution data accumulated for corresponding 64 bytes of
code. You can change this granularity with a `-block-size` option.
E.g. set it to 4096 to see code usage grouped by 4K pages.
Other useful options are:
with `-o <heatmapfile>` option.

```bash
-line-size=<uint> - number of entries per line (default 256)
-max-address=<uint> - maximum address considered valid for heatmap (default 4GB)
```

If you prefer to look at the data in a browser (or would like to share
it that way), then you can use an HTML conversion tool. E.g.:

```bash
$ aha -b -f <heatmapfile> > <heatmapfile>.html
```

---

## Background on heatmaps:
A heatmap is effectively a histogram that is rendered into a grid for better
visualization.
In theory we can generate a heatmap using any binary and a perf profile.

Each block/character in the heatmap shows the execution data accumulated for
corresponding 64 bytes of code. You can change this granularity with a
`-block-size` option.
E.g. set it to 4096 to see code usage grouped by 4K pages.


When a block is shown as a dot, it means that no samples were found for that
address.
When it is shown as a letter, it indicates a captured sample on a particular
text section of the binary.
To show a mapping between letters and text sections in the legend, use
`-print-mappings`.
When a sampled address does not belong to any of the text sections, the
characters 'o' or 'O' will be shown.

The legend shows by default the ranges in the heatmap according to the number
of samples per block.
A color is assigned per range, except the first two ranges, which are distinguished by
lower and upper case letters.

On the Y axis, each row/line starts with an actual address of the binary.
Consecutive lines in the heatmap advance by the same amount, with the binary
size covered by a line dependent on the block size and the line size.
An empty new line is inserted for larger gaps between samples.
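
The row arithmetic described above can be sketched as follows. This is only an illustration, not code from `llvm-bolt-heatmap`: the `HeatmapGeometry` struct and its members are hypothetical, and it assumes that one row covers `block size * line size` bytes and that rows are aligned to that size.

```cpp
#include <cstdint>

// Hypothetical helper, not part of BOLT: models the layout of heatmap rows.
struct HeatmapGeometry {
  uint64_t BlockSize; // bytes per character/block (-block-size, default 64)
  uint64_t LineSize;  // blocks per row (-line-size, default 256)

  // Bytes of the binary covered by one row.
  // Assumption: this is simply BlockSize * LineSize.
  uint64_t bytesPerLine() const { return BlockSize * LineSize; }

  // Address at which the row containing Address starts.
  // Assumption: rows are aligned to bytesPerLine().
  uint64_t rowStart(uint64_t Address) const {
    return Address - Address % bytesPerLine();
  }

  // Index of the block (column) that holds Address within its row.
  uint64_t columnOf(uint64_t Address) const {
    return (Address % bytesPerLine()) / BlockSize;
  }
};
```

Under these assumptions, the default settings give 64 * 256 = 16384 bytes per row, and a line size of 128 with the default block size gives 8192 bytes per row.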

On the X axis, the horizontally emitted hex numbers can help *estimate* where
in the line the samples lie, but they cannot be combined to provide a full
address, as they are relative to both the bucket and line sizes.

In the example below, the highlighted `0x100` column is not an offset to each
row's address, but instead, it points to the middle of the line.
For the generation, the default bucket size was used with a line size of 128.


![](./HeatmapHeader.png)


Some useful options are:

```
-line-size=<uint> - number of entries per line (default 256)
-max-address=<uint> - maximum address considered valid for heatmap (default 4GB)
-print-mappings - print mappings in the legend, between characters/blocks and text sections (default false)
```
64 changes: 44 additions & 20 deletions bolt/include/bolt/Core/DebugData.h
@@ -256,7 +256,7 @@ class DebugRangeListsSectionWriter : public DebugRangesSectionWriter {
};
virtual ~DebugRangeListsSectionWriter(){};

static void setAddressWriter(DebugAddrWriter *AddrW) { AddrWriter = AddrW; }
void setAddressWriter(DebugAddrWriter *AddrW) { AddrWriter = AddrW; }

/// Add ranges with caching.
uint64_t addRanges(
@@ -284,7 +284,7 @@ class DebugRangeListsSectionWriter : public DebugRangesSectionWriter {
}

private:
static DebugAddrWriter *AddrWriter;
DebugAddrWriter *AddrWriter = nullptr;
/// Used to find unique CU ID.
DWARFUnit *CU;
/// Current relative offset of range list entry within this CUs rangelist
@@ -336,21 +336,36 @@ using AddressSectionBuffer = SmallVector<char, 4>;
class DebugAddrWriter {
public:
DebugAddrWriter() = delete;
DebugAddrWriter(BinaryContext *BC_);
DebugAddrWriter(BinaryContext *BC_) : DebugAddrWriter(BC_, UCHAR_MAX) {};
DebugAddrWriter(BinaryContext *BC_, uint8_t AddressByteSize);
virtual ~DebugAddrWriter(){};
/// Given an address returns an index in .debug_addr.
/// Adds Address to map.
uint32_t getIndexFromAddress(uint64_t Address, DWARFUnit &CU);

/// Write out entries in to .debug_addr section for CUs.
virtual void update(DIEBuilder &DIEBlder, DWARFUnit &CUs);
virtual std::optional<uint64_t> finalize(const size_t BufferSize);

/// Return buffer with all the entries in .debug_addr already written out using
/// update(...).
virtual AddressSectionBuffer &finalize() { return *Buffer; }
virtual std::unique_ptr<AddressSectionBuffer> releaseBuffer() {
return std::move(Buffer);
}

/// Returns buffer size.
virtual size_t getBufferSize() const { return Buffer->size(); }

/// Returns True if Buffer is not empty.
bool isInitialized() const { return !Buffer->empty(); }

/// Returns False if .debug_addr section was created..
bool isInitialized() const { return !AddressMaps.empty(); }
/// Updates address base with the given Offset.
virtual void updateAddrBase(DIEBuilder &DIEBlder, DWARFUnit &CU,
const uint64_t Offset);

/// Appends an AddressSectionBuffer to the address writer's buffer.
void appendToAddressBuffer(const AddressSectionBuffer &Buffer) {
*AddressStream << Buffer;
}

protected:
class AddressForDWOCU {
@@ -407,23 +422,32 @@ class DebugAddrWriter {
}

BinaryContext *BC;
/// Maps DWOID to AddressForDWOCU.
std::unordered_map<uint64_t, AddressForDWOCU> AddressMaps;
/// Address for the DWO CU associated with the address writer.
AddressForDWOCU Map;
uint8_t AddressByteSize;
/// Mutex used for parallel processing of debug info.
std::mutex WriterMutex;
std::unique_ptr<AddressSectionBuffer> Buffer;
std::unique_ptr<raw_svector_ostream> AddressStream;
/// Used to track sections that were not modified so that they can be re-used.
DenseMap<uint64_t, uint64_t> UnmodifiedAddressOffsets;
static DenseMap<uint64_t, uint64_t> UnmodifiedAddressOffsets;
};

class DebugAddrWriterDwarf5 : public DebugAddrWriter {
public:
DebugAddrWriterDwarf5() = delete;
DebugAddrWriterDwarf5(BinaryContext *BC) : DebugAddrWriter(BC) {}
DebugAddrWriterDwarf5(BinaryContext *BC, uint8_t AddressByteSize,
std::optional<uint64_t> AddrOffsetSectionBase)
: DebugAddrWriter(BC, AddressByteSize),
AddrOffsetSectionBase(AddrOffsetSectionBase) {}

/// Write out entries in to .debug_addr section for CUs.
virtual void update(DIEBuilder &DIEBlder, DWARFUnit &CUs) override;
virtual std::optional<uint64_t> finalize(const size_t BufferSize) override;

/// Updates address base with the given Offset.
virtual void updateAddrBase(DIEBuilder &DIEBlder, DWARFUnit &CU,
const uint64_t Offset) override;

protected:
/// Given DWARFUnit \p Unit returns either DWO ID or its offset within
@@ -435,6 +459,10 @@ class DebugAddrWriterDwarf5 : public DebugAddrWriter {
}
return Unit.getOffset();
}

private:
std::optional<uint64_t> AddrOffsetSectionBase = std::nullopt;
static constexpr uint32_t HeaderSize = 8;
};

/// This class is NOT thread safe.
@@ -583,12 +611,10 @@ class DebugLoclistWriter : public DebugLocWriter {
public:
~DebugLoclistWriter() {}
DebugLoclistWriter() = delete;
DebugLoclistWriter(DWARFUnit &Unit, uint8_t DV, bool SD)
: DebugLocWriter(DV, LocWriterKind::DebugLoclistWriter), CU(Unit),
IsSplitDwarf(SD) {
assert(DebugLoclistWriter::AddrWriter &&
"Please use SetAddressWriter to initialize "
"DebugAddrWriter before instantiation.");
DebugLoclistWriter(DWARFUnit &Unit, uint8_t DV, bool SD,
DebugAddrWriter &AddrW)
: DebugLocWriter(DV, LocWriterKind::DebugLoclistWriter),
AddrWriter(AddrW), CU(Unit), IsSplitDwarf(SD) {
if (DwarfVersion >= 5) {
LocBodyBuffer = std::make_unique<DebugBufferVector>();
LocBodyStream = std::make_unique<raw_svector_ostream>(*LocBodyBuffer);
@@ -600,8 +626,6 @@ class DebugLoclistWriter : public DebugLocWriter {
}
}

static void setAddressWriter(DebugAddrWriter *AddrW) { AddrWriter = AddrW; }

/// Stores location lists internally to be written out during finalize phase.
virtual void addList(DIEBuilder &DIEBldr, DIE &Die, DIEValue &AttrInfo,
DebugLocationsVector &LocList) override;
@@ -630,7 +654,7 @@ class DebugLoclistWriter : public DebugLocWriter {
/// Writes out locations in to a local buffer and applies debug info patches.
void finalizeDWARF5(DIEBuilder &DIEBldr, DIE &Die);

static DebugAddrWriter *AddrWriter;
DebugAddrWriter &AddrWriter;
DWARFUnit &CU;
bool IsSplitDwarf{false};
// Used for DWARF5 to store location lists before being finalized.
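
Reviewer note: the change in this header replaces a class-wide `static` address-writer pointer, set through `setAddressWriter`, with a reference passed to the constructor. The sketch below illustrates that pattern in isolation. `AddrWriter` and `LocListWriter` are simplified, hypothetical stand-ins and do not reproduce the BOLT classes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an address writer: hands out a stable index per
// unique address.
class AddrWriter {
public:
  uint32_t getIndexFromAddress(uint64_t Address) {
    for (uint32_t I = 0; I < Addresses.size(); ++I)
      if (Addresses[I] == Address)
        return I; // address already known, reuse its index
    Addresses.push_back(Address);
    return static_cast<uint32_t>(Addresses.size() - 1);
  }
  std::size_t size() const { return Addresses.size(); }

private:
  std::vector<uint64_t> Addresses;
};

// Hypothetical location-list writer. The address writer is injected through
// the constructor, so two instances can use two different address writers.
class LocListWriter {
public:
  explicit LocListWriter(AddrWriter &AddrW) : Writer(AddrW) {}

  uint32_t addAddress(uint64_t Address) {
    return Writer.getIndexFromAddress(Address);
  }

private:
  AddrWriter &Writer; // a reference member can never be left uninitialized
};
```

With a reference member, the runtime `assert` that the static pointer had been set is no longer needed: an instance cannot be constructed without a writer.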
11 changes: 8 additions & 3 deletions bolt/include/bolt/Core/MCPlusBuilder.h
@@ -2041,9 +2041,14 @@ class MCPlusBuilder {
return InstructionListType();
}

virtual InstructionListType createDummyReturnFunction(MCContext *Ctx) const {
llvm_unreachable("not implemented");
return InstructionListType();
/// Returns a function body that contains only a return instruction. An
/// example usage is a workaround for the '__bolt_fini_trampoline' of
/// Instrumentation.
virtual InstructionListType
createReturnInstructionList(MCContext *Ctx) const {
InstructionListType Insts(1);
createReturn(Insts[0]);
return Insts;
}
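
Reviewer note: the new default body builds a one-element list and delegates to `createReturn`, so a target that already provides `createReturn` gets a working implementation without an override. A minimal sketch of that pattern follows. `Inst`, `Builder` and `ExampleTargetBuilder` are hypothetical names, and the `"ret"` opcode string is only a placeholder.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified instruction type.
struct Inst {
  std::string Opcode;
};
using InstList = std::vector<Inst>;

class Builder {
public:
  virtual ~Builder() = default;

  // Target hook: fill in a single return instruction.
  virtual void createReturn(Inst &I) const = 0;

  // Default implementation built on top of the hook, mirroring the shape of
  // the new createReturnInstructionList body.
  virtual InstList createReturnInstructionList() const {
    InstList Insts(1);      // one default-constructed instruction
    createReturn(Insts[0]); // let the target turn it into a return
    return Insts;
  }
};

// Hypothetical target that only implements the hook.
class ExampleTargetBuilder : public Builder {
public:
  void createReturn(Inst &I) const override { I.Opcode = "ret"; }
};
```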

/// This method takes an indirect call instruction and splits it up into an
15 changes: 9 additions & 6 deletions bolt/include/bolt/Rewrite/DWARFRewriter.h
@@ -66,10 +66,6 @@ class DWARFRewriter {
/// .debug_aranges DWARF section.
std::unique_ptr<DebugARangesSectionWriter> ARangesSectionWriter;

/// Stores and serializes information that will be put into the
/// .debug_addr DWARF section.
std::unique_ptr<DebugAddrWriter> AddrWriter;

/// Stores and serializes information that will be put in to the
/// .debug_addr DWARF section.
/// Does not do de-duplication.
@@ -93,6 +89,10 @@ class DWARFRewriter {
std::unordered_map<uint64_t, std::unique_ptr<DebugRangesSectionWriter>>
LegacyRangesWritersByCU;

/// Stores address writer for each CU.
std::unordered_map<uint64_t, std::unique_ptr<DebugAddrWriter>>
AddressWritersByCU;

std::mutex LocListDebugInfoPatchesMutex;

/// Dwo id specific its RangesBase.
@@ -115,6 +115,7 @@ class DWARFRewriter {
void updateUnitDebugInfo(DWARFUnit &Unit, DIEBuilder &DIEBldr,
DebugLocWriter &DebugLocWriter,
DebugRangesSectionWriter &RangesSectionWriter,
DebugAddrWriter &AddressWriter,
std::optional<uint64_t> RangesBase = std::nullopt);

/// Patches the binary for an object's address ranges to be updated.
@@ -141,13 +142,15 @@ class DWARFRewriter {
/// Process and write out CUs that are passed in.
void finalizeCompileUnits(DIEBuilder &DIEBlder, DIEStreamer &Streamer,
CUOffsetMap &CUMap,
const std::list<DWARFUnit *> &CUs);
const std::list<DWARFUnit *> &CUs,
DebugAddrWriter &FinalAddrWriter);

/// Finalize debug sections in the main binary.
void finalizeDebugSections(DIEBuilder &DIEBlder,
DWARF5AcceleratorTable &DebugNamesTable,
DIEStreamer &Streamer, raw_svector_ostream &ObjOS,
CUOffsetMap &CUMap);
CUOffsetMap &CUMap,
DebugAddrWriter &FinalAddrWriter);

/// Patches the binary for DWARF address ranges (e.g. in functions and lexical
/// blocks) to be updated.
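
Reviewer note: this header now keeps one address writer per CU in `AddressWritersByCU` and threads a `FinalAddrWriter` through the finalize functions. The diff shown here does not include the merge logic itself, so the sketch below is only an assumption about how per-CU buffers could be collected into a final writer. All names (`AddrWriter`, `getOrCreateWriter`, `mergeInto`) are hypothetical and the types are simplified.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

using Buffer = std::vector<char>;

// Hypothetical, simplified address writer that owns its output buffer.
class AddrWriter {
public:
  AddrWriter() : Buf(std::make_unique<Buffer>()) {}

  void write(char Byte) { Buf->push_back(Byte); }

  // Transfers ownership of the buffer to the caller. The writer must not be
  // written to or queried afterwards.
  std::unique_ptr<Buffer> releaseBuffer() { return std::move(Buf); }

  void append(const Buffer &Other) {
    Buf->insert(Buf->end(), Other.begin(), Other.end());
  }

  std::size_t size() const { return Buf->size(); }
  const Buffer &data() const { return *Buf; }

private:
  std::unique_ptr<Buffer> Buf;
};

using WritersByCU = std::unordered_map<uint64_t, std::unique_ptr<AddrWriter>>;

// Returns the writer for CUID, creating it on first use.
AddrWriter &getOrCreateWriter(WritersByCU &Writers, uint64_t CUID) {
  std::unique_ptr<AddrWriter> &Slot = Writers[CUID];
  if (!Slot)
    Slot = std::make_unique<AddrWriter>();
  return *Slot;
}

// Appends each per-CU buffer to Final, in the explicit order given. An
// explicit order is used because unordered_map iteration order is unspecified.
void mergeInto(AddrWriter &Final, WritersByCU &Writers,
               const std::vector<uint64_t> &Order) {
  for (uint64_t CUID : Order) {
    auto It = Writers.find(CUID);
    if (It == Writers.end())
      continue; // no addresses were recorded for this CU
    std::unique_ptr<Buffer> Released = It->second->releaseBuffer();
    if (Released)
      Final.append(*Released);
  }
}
```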
1 change: 1 addition & 0 deletions bolt/include/bolt/Utils/CommandLineOpts.h
@@ -40,6 +40,7 @@ extern llvm::cl::opt<unsigned> ExecutionCountThreshold;
extern llvm::cl::opt<unsigned> HeatmapBlock;
extern llvm::cl::opt<unsigned long long> HeatmapMaxAddress;
extern llvm::cl::opt<unsigned long long> HeatmapMinAddress;
extern llvm::cl::opt<bool> HeatmapPrintMappings;
extern llvm::cl::opt<bool> HotData;
extern llvm::cl::opt<bool> HotFunctionsAtEnd;
extern llvm::cl::opt<bool> HotText;
20 changes: 18 additions & 2 deletions bolt/lib/Core/DIEBuilder.cpp
@@ -556,7 +556,17 @@ DWARFDie DIEBuilder::resolveDIEReference(
const DWARFAbbreviationDeclaration::AttributeSpec AttrSpec,
DWARFUnit *&RefCU, DWARFDebugInfoEntry &DwarfDebugInfoEntry) {
assert(RefValue.isFormClass(DWARFFormValue::FC_Reference));
uint64_t RefOffset = *RefValue.getAsReference();
uint64_t RefOffset;
if (std::optional<uint64_t> Off = RefValue.getAsRelativeReference()) {
RefOffset = RefValue.getUnit()->getOffset() + *Off;
} else if (Off = RefValue.getAsDebugInfoReference(); Off) {
RefOffset = *Off;
} else {
BC.errs()
<< "BOLT-WARNING: [internal-dwarf-error]: unsupported reference type: "
<< FormEncodingString(RefValue.getForm()) << ".\n";
return DWARFDie();
}
return resolveDIEReference(AttrSpec, RefOffset, RefCU, DwarfDebugInfoEntry);
}
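
Reviewer note: the new code distinguishes unit-relative references (for example `DW_FORM_ref4`), whose offset must be added to the offset of the owning unit, from references that are already `.debug_info` section offsets (for example `DW_FORM_ref_addr`). The standalone sketch below models that decision without LLVM. `RefValue` and `resolveOffset` are hypothetical names that stand in for `DWARFFormValue` and the logic above.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of a DWARF reference attribute value. At most one of the
// two optional offsets is expected to be set.
struct RefValue {
  std::optional<uint64_t> RelativeOffset;  // offset from the start of the unit
  std::optional<uint64_t> DebugInfoOffset; // absolute offset in .debug_info
  uint64_t UnitOffset = 0;                 // offset of the owning unit
};

// Returns the absolute .debug_info offset the value refers to, or nullopt for
// a reference kind this sketch does not handle.
std::optional<uint64_t> resolveOffset(const RefValue &V) {
  if (V.RelativeOffset)
    return V.UnitOffset + *V.RelativeOffset; // rebase onto the unit start
  if (V.DebugInfoOffset)
    return *V.DebugInfoOffset; // already absolute
  return std::nullopt;         // caller decides whether to warn or bail out
}
```

Returning an empty optional instead of dereferencing unconditionally matches the change above, where an unsupported reference now yields a warning and an empty `DWARFDie` rather than dereferencing an empty optional.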

@@ -607,7 +617,13 @@ void DIEBuilder::cloneDieReferenceAttribute(
DIE &Die, const DWARFUnit &U, const DWARFDie &InputDIE,
const DWARFAbbreviationDeclaration::AttributeSpec AttrSpec,
const DWARFFormValue &Val) {
const uint64_t Ref = *Val.getAsReference();
uint64_t Ref;
if (std::optional<uint64_t> Off = Val.getAsRelativeReference())
Ref = Val.getUnit()->getOffset() + *Off;
else if (Off = Val.getAsDebugInfoReference(); Off)
Ref = *Off;
else
return;

DIE *NewRefDie = nullptr;
DWARFUnit *RefUnit = nullptr;