[flang] handle alloca outside of entry blocks in MemoryAllocation (llvm#98457)

This patch generalizes the MemoryAllocation pass (alloca -> heap) to
handle fir.alloca regardless of their position in the IR. Previously, it
only dealt with fir.alloca in function entry blocks. The logic is placed
in a utility that can replace the alloca nested in an operation on
demand with whatever kind of allocation the utility user wants, via
callbacks (allocmem, or custom runtime calls to instrument the code...).

To do so, a concept of ownership, which was already implied and
used in passes like stack-reclaim, is formalized. Any operation with the
LoopLikeOpInterface, AutomaticAllocationScope, or IsIsolatedFromAbove
trait owns the alloca directly nested inside its regions, and these
alloca must not be used after the operation.
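
As an illustration of the ownership rule only, here is a minimal sketch in plain C++ that does not use MLIR. `ToyOp`, its boolean fields, `findOwner`, and the non-owning `if` operation are hypothetical stand-ins. The actual implementation is `fir::AllocaOp::getOwnerRegion` in FIROps.cpp, which returns a region rather than an operation and also gives up on unregistered operations.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for an IR operation. This is NOT the MLIR API.
struct ToyOp {
  std::string name;
  ToyOp *parent = nullptr;          // enclosing operation, if any
  bool isLoopLike = false;          // models LoopLikeOpInterface
  bool isAllocationScope = false;   // models AutomaticAllocationScope
  bool isIsolatedFromAbove = false; // models IsIsolatedFromAbove
};

// An operation owns the alloca directly nested in its regions if it has
// at least one of the three properties.
bool ownsNestedAlloca(const ToyOp &op) {
  return op.isLoopLike || op.isAllocationScope || op.isIsolatedFromAbove;
}

// Walk up the parents and return the closest owning operation, or nullptr
// when no owner can be identified.
const ToyOp *findOwner(const ToyOp &alloca) {
  for (const ToyOp *op = alloca.parent; op; op = op->parent)
    if (ownsNestedAlloca(*op))
      return op;
  return nullptr;
}

// Builds func { if { loop { alloca } } } and checks who owns what.
void exampleOwnership() {
  ToyOp func{"func", nullptr, false, true, true};
  ToyOp branch{"if", &func}; // hypothetical op with none of the traits
  ToyOp loop{"loop", &branch, true};
  ToyOp inLoop{"alloca", &loop};
  ToyOp inBranch{"alloca", &branch};
  assert(findOwner(inLoop) == &loop);   // closest owner is the loop
  assert(findOwner(inBranch) == &func); // the "if" is skipped
}
```

In this toy model, an alloca nested in the loop may not be used after the loop, while an alloca nested in the `if` stays usable until the function ends.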

The pass then looks for the exit points of the regions of such
operations and uses them to insert deallocations. If the alloca cannot be
proved to dominate these exit points, the pass falls back to storing the
new address into a C pointer variable created in the entry of the owning
region, which allows inserting deallocations as needed, including near
the alloca itself to avoid leaks when the alloca is executed multiple
times due to loops in the block CFG.
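
The fallback can be pictured with the following plain C++ sketch of what the rewritten code might look like at run time. MemoryUtils.cpp is not part of this excerpt, so the null initialization, the null check before freeing, and all names (`toyAlloc`, `toyFree`, `runRegion`, `slot`) are assumptions made for illustration, not the pass's actual output.

```cpp
#include <cassert>
#include <cstdlib>

// Counter used only to observe the behaviour of the sketch.
static int liveAllocations = 0;

void *toyAlloc(std::size_t bytes) {
  void *p = std::malloc(bytes);
  if (p)
    ++liveAllocations;
  return p;
}

void toyFree(void *p) {
  if (!p)
    return; // nothing was allocated through this slot yet
  std::free(p);
  --liveAllocations;
}

// Models an owning region whose block CFG loops over the allocation site.
// Returns the maximum number of simultaneously live allocations.
int runRegion(int iterations) {
  void *slot = nullptr; // pointer variable created at the region entry
  int maxLive = 0;
  for (int i = 0; i < iterations; ++i) {
    toyFree(slot);       // deallocation placed near the allocation site
    slot = toyAlloc(64); // what used to be the stack allocation
    if (liveAllocations > maxLive)
      maxLive = liveAllocations;
  }
  toyFree(slot); // deallocation at the region exit
  return maxLive;
}

void exampleNoLeak() {
  int maxLive = runRegion(5);
  assert(maxLive <= 1);         // earlier iterations were freed
  assert(liveAllocations == 0); // nothing leaks past the region
}
```

Freeing through the slot just before each new allocation is what keeps a re-executed allocation site from leaking its previous storage.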

This should fix llvm#88344.

As a next step, I will try to refactor lowering a bit to introduce
lifetime operations for alloca so that the deallocation points can be
inserted as soon as possible.
jeanPerier authored and sgundapa committed Jul 23, 2024
1 parent 2f8de7c commit 00486d9
Showing 9 changed files with 610 additions and 108 deletions.
18 changes: 13 additions & 5 deletions flang/include/flang/Optimizer/Builder/FIRBuilder.h
@@ -38,6 +38,13 @@ class ExtendedValue;
class MutableBoxValue;
class BoxValue;

/// Get the integer type with a pointer size.
inline mlir::Type getIntPtrType(mlir::OpBuilder &builder) {
// TODO: Delay the need of such type until codegen or find a way to use
// llvm::DataLayout::getPointerSizeInBits here.
return builder.getI64Type();
}

//===----------------------------------------------------------------------===//
// FirOpBuilder
//===----------------------------------------------------------------------===//
@@ -143,11 +150,7 @@ class FirOpBuilder : public mlir::OpBuilder, public mlir::OpBuilder::Listener {

/// Get the integer type whose bit width corresponds to the width of pointer
/// types, or is bigger.
mlir::Type getIntPtrType() {
// TODO: Delay the need of such type until codegen or find a way to use
// llvm::DataLayout::getPointerSizeInBits here.
return getI64Type();
}
mlir::Type getIntPtrType() { return fir::getIntPtrType(*this); }

/// Wrap `str` to a SymbolRefAttr.
mlir::SymbolRefAttr getSymbolRefAttr(llvm::StringRef str) {
@@ -712,6 +715,11 @@ fir::BoxValue createBoxValue(fir::FirOpBuilder &builder, mlir::Location loc,
mlir::Value createNullBoxProc(fir::FirOpBuilder &builder, mlir::Location loc,
mlir::Type boxType);

/// Convert a value to a new type. Return the value directly if it has the right
/// type.
mlir::Value createConvert(mlir::OpBuilder &, mlir::Location, mlir::Type,
mlir::Value);

/// Set internal linkage attribute on a function.
void setInternalLinkage(mlir::func::FuncOp);

13 changes: 13 additions & 0 deletions flang/include/flang/Optimizer/Dialect/FIROps.td
@@ -124,6 +124,13 @@ def fir_AllocaOp : fir_Op<"alloca", [AttrSizedOperandSegments,
Indeed, a user would likely expect a good Fortran compiler to perform such
an optimization.

Stack allocations have a maximum lifetime concept: their uses must not
exceed the lifetime of the closest parent operation with the
AutomaticAllocationScope trait, IsIsolatedFromAbove trait, or
LoopLikeOpInterface trait. This restriction is meant to ease the
insertion of stack save and restore operations, and to ease the conversion
of stack allocation into heap allocation.

Until Fortran 2018, procedures defaulted to non-recursive. A legal
implementation could therefore convert stack allocations to global
allocations. Such a conversion effectively adds the SAVE attribute to all
@@ -183,11 +190,17 @@ def fir_AllocaOp : fir_Op<"alloca", [AttrSizedOperandSegments,
mlir::Type getAllocatedType();
bool hasLenParams() { return !getTypeparams().empty(); }
bool hasShapeOperands() { return !getShape().empty(); }
bool isDynamic() {return hasLenParams() || hasShapeOperands();}
unsigned numLenParams() { return getTypeparams().size(); }
operand_range getLenParams() { return getTypeparams(); }
unsigned numShapeOperands() { return getShape().size(); }
operand_range getShapeOperands() { return getShape(); }
static mlir::Type getRefTy(mlir::Type ty);
/// Is this an operation that owns the alloca directly made in its region?
static bool ownsNestedAlloca(mlir::Operation* op);
/// Get the parent region that owns this alloca. Nullptr if none can be
/// identified.
mlir::Region* getOwnerRegion();
}];
}

62 changes: 62 additions & 0 deletions flang/include/flang/Optimizer/Transforms/MemoryUtils.h
@@ -0,0 +1,62 @@
//===-- Optimizer/Transforms/MemoryUtils.h ----------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// Coding style: https://mlir.llvm.org/getting_started/DeveloperGuide/
//
//===----------------------------------------------------------------------===//
//
// This file defines a utility to replace fir.alloca by dynamic allocation and
// deallocation. The exact kind of dynamic allocation is left to be defined by
// the utility user via callbacks (could be fir.allocmem or custom runtime
// calls).
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_OPTIMIZER_TRANSFORMS_MEMORYUTILS_H
#define FORTRAN_OPTIMIZER_TRANSFORMS_MEMORYUTILS_H

#include "flang/Optimizer/Dialect/FIROps.h"

namespace mlir {
class RewriterBase;
}

namespace fir {

/// Type of callbacks that indicate if a given fir.alloca must be
/// rewritten.
using MustRewriteCallBack = llvm::function_ref<bool(fir::AllocaOp)>;

/// Type of callbacks that produce the replacement for a given fir.alloca.
/// It is provided extra information about the dominance of the deallocation
/// points that have been identified, and may refuse to replace the alloca,
/// even if the MustRewriteCallBack previously returned true, in which case
/// it should return a null value.
/// The callback should not delete the alloca, the utility will do it.
using AllocaRewriterCallBack = llvm::function_ref<mlir::Value(
mlir::OpBuilder &, fir::AllocaOp, bool allocaDominatesDeallocLocations)>;
/// Type of callbacks that must generate deallocation of storage obtained via
/// AllocaRewriterCallBack calls.
using DeallocCallBack =
llvm::function_ref<void(mlir::Location, mlir::OpBuilder &, mlir::Value)>;

/// Utility to replace fir.alloca by dynamic allocations inside \p parentOp.
/// \p MustRewriteCallBack lets the user control which fir.alloca should be
/// replaced. \p AllocaRewriterCallBack lets the user define how the new memory
/// should be allocated. \p DeallocCallBack lets the user decide how the memory
/// should be deallocated. The boolean result indicates if the utility succeeded
/// to replace all fir.alloca as requested by the user. Causes of failures are
/// the presence of unregistered operations, or OpenMP/ACC recipe operations
/// that return memory allocated inside their region.
bool replaceAllocas(mlir::RewriterBase &rewriter, mlir::Operation *parentOp,
MustRewriteCallBack, AllocaRewriterCallBack,
DeallocCallBack);

} // namespace fir

#endif // FORTRAN_OPTIMIZER_TRANSFORMS_MEMORYUTILS_H
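
The three-callback design of the header above can be illustrated with a toy model in plain C++. It uses `std::function` instead of `llvm::function_ref`, strings instead of `mlir::Value`, and every name in it (`ToyAlloca`, `replaceToyAllocas`, the `heap_` prefix) is hypothetical. It returns a count of replacements, whereas the real `fir::replaceAllocas` returns a `bool` whose failure causes are listed in its comment.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a fir.alloca; the fields are hypothetical.
struct ToyAlloca {
  std::string name;
  bool dynamicSize;
  std::string replacement; // empty while still a stack allocation
};

using MustRewrite = std::function<bool(const ToyAlloca &)>;
// Returns the new storage, or an empty string to refuse the rewrite.
using Rewriter = std::function<std::string(const ToyAlloca &)>;
using Dealloc = std::function<void(const std::string &)>;

// Asks the callbacks what to do for each alloca and returns how many
// were actually replaced.
std::size_t replaceToyAllocas(std::vector<ToyAlloca> &allocas,
                              MustRewrite mustRewrite, Rewriter rewrite,
                              Dealloc dealloc) {
  std::size_t replaced = 0;
  for (ToyAlloca &alloca : allocas) {
    if (!mustRewrite(alloca))
      continue; // the user wants to keep this one on the stack
    std::string storage = rewrite(alloca);
    if (storage.empty())
      continue; // the rewriter refused; the alloca is left untouched
    alloca.replacement = storage;
    dealloc(storage); // the user decides how the storage is released
    ++replaced;
  }
  return replaced;
}

void exampleCallbacks() {
  std::vector<ToyAlloca> allocas = {
      {"a", true, ""}, {"b", false, ""}, {"c", true, ""}};
  std::vector<std::string> freed;
  std::size_t count = replaceToyAllocas(
      allocas,
      [](const ToyAlloca &a) { return a.dynamicSize; },
      [](const ToyAlloca &a) {
        // Refuse "c" to show that selection and rewriting are separate.
        return a.name == "c" ? std::string() : "heap_" + a.name;
      },
      [&](const std::string &storage) { freed.push_back(storage); });
  assert(count == 1);
  assert(freed.size() == 1 && freed[0] == "heap_a");
  assert(allocas[1].replacement.empty()); // "b" was never selected
}
```

Keeping the selection, allocation, and deallocation policies in separate callbacks is what lets the same traversal serve both `fir.allocmem` and custom runtime calls.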
12 changes: 9 additions & 3 deletions flang/lib/Optimizer/Builder/FIRBuilder.cpp
@@ -455,15 +455,21 @@ mlir::Value fir::FirOpBuilder::convertWithSemantics(
return createConvert(loc, toTy, val);
}

mlir::Value fir::FirOpBuilder::createConvert(mlir::Location loc,
mlir::Type toTy, mlir::Value val) {
mlir::Value fir::factory::createConvert(mlir::OpBuilder &builder,
mlir::Location loc, mlir::Type toTy,
mlir::Value val) {
if (val.getType() != toTy) {
assert(!fir::isa_derived(toTy));
return create<fir::ConvertOp>(loc, toTy, val);
return builder.create<fir::ConvertOp>(loc, toTy, val);
}
return val;
}

mlir::Value fir::FirOpBuilder::createConvert(mlir::Location loc,
mlir::Type toTy, mlir::Value val) {
return fir::factory::createConvert(*this, loc, toTy, val);
}

void fir::FirOpBuilder::createStoreWithConvert(mlir::Location loc,
mlir::Value val,
mlir::Value addr) {
21 changes: 21 additions & 0 deletions flang/lib/Optimizer/Dialect/FIROps.cpp
@@ -275,6 +275,27 @@ llvm::LogicalResult fir::AllocaOp::verify() {
return mlir::success();
}

bool fir::AllocaOp::ownsNestedAlloca(mlir::Operation *op) {
return op->hasTrait<mlir::OpTrait::IsIsolatedFromAbove>() ||
op->hasTrait<mlir::OpTrait::AutomaticAllocationScope>() ||
mlir::isa<mlir::LoopLikeOpInterface>(*op);
}

mlir::Region *fir::AllocaOp::getOwnerRegion() {
mlir::Operation *currentOp = getOperation();
while (mlir::Operation *parentOp = currentOp->getParentOp()) {
// If the operation was not registered, inquiries about its traits will be
// incorrect and it is not possible to reason about the operation. This
// should not happen in a normal Fortran compilation flow, but be foolproof.
if (!parentOp->isRegistered())
return nullptr;
if (fir::AllocaOp::ownsNestedAlloca(parentOp))
return currentOp->getParentRegion();
currentOp = parentOp;
}
return nullptr;
}

//===----------------------------------------------------------------------===//
// AllocMemOp
//===----------------------------------------------------------------------===//
1 change: 1 addition & 0 deletions flang/lib/Optimizer/Transforms/CMakeLists.txt
@@ -10,6 +10,7 @@ add_flang_library(FIRTransforms
ControlFlowConverter.cpp
ArrayValueCopy.cpp
ExternalNameConversion.cpp
MemoryUtils.cpp
MemoryAllocation.cpp
StackArrays.cpp
MemRefDataFlowOpt.cpp
143 changes: 43 additions & 100 deletions flang/lib/Optimizer/Transforms/MemoryAllocation.cpp
@@ -9,6 +9,7 @@
#include "flang/Optimizer/Dialect/FIRDialect.h"
#include "flang/Optimizer/Dialect/FIROps.h"
#include "flang/Optimizer/Dialect/FIRType.h"
#include "flang/Optimizer/Transforms/MemoryUtils.h"
#include "flang/Optimizer/Transforms/Passes.h"
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/IR/Diagnostics.h"
@@ -27,50 +28,18 @@ namespace fir {
// Number of elements in an array does not determine where it is allocated.
static constexpr std::size_t unlimitedArraySize = ~static_cast<std::size_t>(0);

namespace {
class ReturnAnalysis {
public:
MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(ReturnAnalysis)

ReturnAnalysis(mlir::Operation *op) {
if (auto func = mlir::dyn_cast<mlir::func::FuncOp>(op))
for (mlir::Block &block : func)
for (mlir::Operation &i : block)
if (mlir::isa<mlir::func::ReturnOp>(i)) {
returnMap[op].push_back(&i);
break;
}
}

llvm::SmallVector<mlir::Operation *> getReturns(mlir::Operation *func) const {
auto iter = returnMap.find(func);
if (iter != returnMap.end())
return iter->second;
return {};
}

private:
llvm::DenseMap<mlir::Operation *, llvm::SmallVector<mlir::Operation *>>
returnMap;
};
} // namespace

/// Return `true` if this allocation is to remain on the stack (`fir.alloca`).
/// Otherwise the allocation should be moved to the heap (`fir.allocmem`).
static inline bool
keepStackAllocation(fir::AllocaOp alloca, mlir::Block *entry,
keepStackAllocation(fir::AllocaOp alloca,
const fir::MemoryAllocationOptOptions &options) {
// Limitation: only arrays allocated on the stack in the entry block are
// considered for now.
// TODO: Generalize the algorithm and placement of the freemem nodes.
if (alloca->getBlock() != entry)
return true;
// Move all arrays and character with runtime determined size to the heap.
if (options.dynamicArrayOnHeap && alloca.isDynamic())
return false;
// TODO: use data layout to reason in terms of byte size to cover all "big"
// entities, which may be scalar derived types.
if (auto seqTy = mlir::dyn_cast<fir::SequenceType>(alloca.getInType())) {
if (fir::hasDynamicSize(seqTy)) {
// Move all arrays with runtime determined size to the heap.
if (options.dynamicArrayOnHeap)
return false;
} else {
if (!fir::hasDynamicSize(seqTy)) {
std::int64_t numberOfElements = 1;
for (std::int64_t i : seqTy.getShape()) {
numberOfElements *= i;
@@ -82,58 +51,37 @@ keepStackAllocation(fir::AllocaOp alloca, mlir::Block *entry,
// the heap.
if (static_cast<std::size_t>(numberOfElements) >
options.maxStackArraySize) {
LLVM_DEBUG(llvm::dbgs()
<< "memory allocation opt: found " << alloca << '\n');
return false;
}
}
}
return true;
}

namespace {
class AllocaOpConversion : public mlir::OpRewritePattern<fir::AllocaOp> {
public:
using OpRewritePattern::OpRewritePattern;

AllocaOpConversion(mlir::MLIRContext *ctx,
llvm::ArrayRef<mlir::Operation *> rets)
: OpRewritePattern(ctx), returnOps(rets) {}

llvm::LogicalResult
matchAndRewrite(fir::AllocaOp alloca,
mlir::PatternRewriter &rewriter) const override {
auto loc = alloca.getLoc();
mlir::Type varTy = alloca.getInType();
auto unpackName =
[](std::optional<llvm::StringRef> opt) -> llvm::StringRef {
if (opt)
return *opt;
return {};
};
auto uniqName = unpackName(alloca.getUniqName());
auto bindcName = unpackName(alloca.getBindcName());
auto heap = rewriter.create<fir::AllocMemOp>(
loc, varTy, uniqName, bindcName, alloca.getTypeparams(),
alloca.getShape());
auto insPt = rewriter.saveInsertionPoint();
for (mlir::Operation *retOp : returnOps) {
rewriter.setInsertionPoint(retOp);
[[maybe_unused]] auto free = rewriter.create<fir::FreeMemOp>(loc, heap);
LLVM_DEBUG(llvm::dbgs() << "memory allocation opt: add free " << free
<< " for " << heap << '\n');
}
rewriter.restoreInsertionPoint(insPt);
rewriter.replaceOpWithNewOp<fir::ConvertOp>(
alloca, fir::ReferenceType::get(varTy), heap);
LLVM_DEBUG(llvm::dbgs() << "memory allocation opt: replaced " << alloca
<< " with " << heap << '\n');
return mlir::success();
}
static mlir::Value genAllocmem(mlir::OpBuilder &builder, fir::AllocaOp alloca,
bool deallocPointsDominateAlloc) {
mlir::Type varTy = alloca.getInType();
auto unpackName = [](std::optional<llvm::StringRef> opt) -> llvm::StringRef {
if (opt)
return *opt;
return {};
};
llvm::StringRef uniqName = unpackName(alloca.getUniqName());
llvm::StringRef bindcName = unpackName(alloca.getBindcName());
auto heap = builder.create<fir::AllocMemOp>(alloca.getLoc(), varTy, uniqName,
bindcName, alloca.getTypeparams(),
alloca.getShape());
LLVM_DEBUG(llvm::dbgs() << "memory allocation opt: replaced " << alloca
<< " with " << heap << '\n');
return heap;
}

private:
llvm::ArrayRef<mlir::Operation *> returnOps;
};
static void genFreemem(mlir::Location loc, mlir::OpBuilder &builder,
mlir::Value allocmem) {
[[maybe_unused]] auto free = builder.create<fir::FreeMemOp>(loc, allocmem);
LLVM_DEBUG(llvm::dbgs() << "memory allocation opt: add free " << free
<< " for " << allocmem << '\n');
}

/// This pass can reclassify memory allocations (fir.alloca, fir.allocmem) based
/// on heuristics and settings. The intention is to allow better performance and
@@ -144,6 +92,7 @@ class AllocaOpConversion : public mlir::OpRewritePattern<fir::AllocaOp> {
/// make it a heap allocation.
/// 2. If a stack allocation is an array with a runtime evaluated size make
/// it a heap allocation.
namespace {
class MemoryAllocationOpt
: public fir::impl::MemoryAllocationOptBase<MemoryAllocationOpt> {
public:
@@ -184,23 +133,17 @@ class MemoryAllocationOpt
// If func is a declaration, skip it.
if (func.empty())
return;

const auto &analysis = getAnalysis<ReturnAnalysis>();

target.addLegalDialect<fir::FIROpsDialect, mlir::arith::ArithDialect,
mlir::func::FuncDialect>();
target.addDynamicallyLegalOp<fir::AllocaOp>([&](fir::AllocaOp alloca) {
return keepStackAllocation(alloca, &func.front(), options);
});

llvm::SmallVector<mlir::Operation *> returnOps = analysis.getReturns(func);
patterns.insert<AllocaOpConversion>(context, returnOps);
if (mlir::failed(
mlir::applyPartialConversion(func, target, std::move(patterns)))) {
mlir::emitError(func.getLoc(),
"error in memory allocation optimization\n");
signalPassFailure();
}
auto tryReplacing = [&](fir::AllocaOp alloca) {
bool res = !keepStackAllocation(alloca, options);
if (res) {
LLVM_DEBUG(llvm::dbgs()
<< "memory allocation opt: found " << alloca << '\n');
}
return res;
};
mlir::IRRewriter rewriter(context);
fir::replaceAllocas(rewriter, func.getOperation(), tryReplacing,
genAllocmem, genFreemem);
}

private:
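
The decision made by the updated `keepStackAllocation` can be approximated with the following plain C++ sketch. It models arrays only, represents a runtime-determined extent as a negative number, and ignores length parameters and multiplication overflow. `ToyOptions` and `keepOnStack` are hypothetical names, and the diff lines collapsed between the hunks above are not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the two options read by the pass.
struct ToyOptions {
  bool dynamicArrayOnHeap = false;
  std::size_t maxStackArraySize = ~static_cast<std::size_t>(0);
};

// Returns true if an array with the given shape should stay on the stack.
// A negative extent models a runtime-determined dimension (an assumption
// of this sketch, not the FIR encoding).
bool keepOnStack(const std::vector<std::int64_t> &shape,
                 const ToyOptions &options) {
  std::int64_t numberOfElements = 1;
  for (std::int64_t extent : shape) {
    if (extent < 0)
      return !options.dynamicArrayOnHeap; // dynamic size: option decides
    numberOfElements *= extent;
  }
  // Constant size: keep on the stack unless it exceeds the limit.
  return static_cast<std::size_t>(numberOfElements) <=
         options.maxStackArraySize;
}

void exampleDecisions() {
  ToyOptions options{true, 100};
  assert(keepOnStack({10, 5}, options));        // 50 elements fit
  assert(!keepOnStack({10, 20}, options));      // 200 elements are too big
  assert(!keepOnStack({-1, 4}, options));       // dynamic goes to the heap
  assert(keepOnStack({-1, 4}, ToyOptions{}));   // defaults keep everything
}
```

Note that the position of the alloca no longer appears in the decision: with this patch only the size and the options matter, and placement of the deallocation is left to `fir::replaceAllocas`.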