[Triton] Generate local MLIR reproducers when possible (#5155)
By setting a reproducer path, the pass manager will dump a standard MLIR
reproducer before each pass manager invocation. This PR also enables
additional local crash reproducer generation (to the same path set
through the env var), which tries to narrow down the specific pass that
failed, if the pass pipeline fails at any point.
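As a usage sketch, the reproducer path can be set from Python around the code that triggers compilation. Only the `TRITON_REPRODUCER_PATH` variable name comes from this commit; the `triton_reproducer` helper and the example path are hypothetical, and the sketch assumes the variable is read when the pass manager runs, so it must be set before the kernel is compiled.

```python
import contextlib
import os


@contextlib.contextmanager
def triton_reproducer(path):
    """Hypothetical helper: set TRITON_REPRODUCER_PATH for the enclosed block.

    Restores the previous value (or unsets the variable) on exit.
    """
    previous = os.environ.get("TRITON_REPRODUCER_PATH")
    os.environ["TRITON_REPRODUCER_PATH"] = str(path)
    try:
        yield str(path)
    finally:
        if previous is None:
            os.environ.pop("TRITON_REPRODUCER_PATH", None)
        else:
            os.environ["TRITON_REPRODUCER_PATH"] = previous


# Usage (assumed workflow): launch the kernel inside the block so that
# compilation happens while the variable is set.
# with triton_reproducer("/tmp/repro.mlir"):
#     my_kernel[grid](...)  # hypothetical kernel launch
```

If the kernel is already in the Triton cache, compilation may be skipped and no reproducer written; clearing the cache, as the README suggests for `MLIR_ENABLE_DUMP`, is likely needed in that case.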
Mogball authored Nov 14, 2024
1 parent 8bf3ae9 commit d5e06fe
Showing 2 changed files with 16 additions and 0 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -176,6 +176,9 @@ For detailed instructions on how to debug Triton's frontend, please refer to thi
kernels. Use `MLIR_ENABLE_DUMP=kernelName` to dump for a specific kernel only.
- Triton cache can interfere with the dump. In cases where `MLIR_ENABLE_DUMP=1` does not work, try cleaning your triton cache: `rm -r ~/.triton/cache/*`
- `LLVM_IR_ENABLE_DUMP=1` dumps the IR before every pass run over the LLVM IR.
- `TRITON_REPRODUCER_PATH=<reproducer_path>` will generate an MLIR reproducer file
at `<reproducer_path>` before each MLIR compiler stage. If any of the stages fail,
`<reproducer_path>` will be a local MLIR reproducer captured right before the failing pass.
- `TRITON_INTERPRET=1` uses the Triton interpreter instead of running on the
GPU. You can insert Python breakpoints in your kernel code!
- `TRITON_ENABLE_LLVM_DEBUG=1` passes `-debug` to LLVM, printing a lot of
13 changes: 13 additions & 0 deletions python/src/ir.cc
@@ -1707,7 +1707,14 @@ void init_triton_ir(py::module &&m) {
auto anchorName = self.getOpAnchorName();
auto passes = self.getPasses();
Operation *op = mod.getOperation();
// Save a reproducer for the current pass manager invocation
// immediately.
makeReproducer(anchorName, passes, op, reproducerPath);
// But if the pass manager crashes, attempt to generate a local
// reproducer instead.
mod.getContext()->disableMultithreading();
self.enableCrashReproducerGeneration(reproducerPath,
/*genLocalReproducer=*/true);
}

if (triton::tools::getBoolEnv("TRITON_ENABLE_LLVM_DEBUG")) {
@@ -1740,6 +1747,12 @@ void init_triton_ir(py::module &&m) {
self.enableTiming();
}

// Run the pass manager under a source manager diagnostic handler, which
// enables emitted MLIR diagnostics to directly reference Python source
// code.
llvm::SourceMgr sourceMgr;
SourceMgrDiagnosticHandler diagHandler(sourceMgr, mod.getContext(),
llvm::errs());
if (failed(self.run(mod.getOperation())))
throw std::runtime_error("PassManager::run failed");
});