[OpenMP][MLIR] Add omp.distribute op to the OMP dialect #67720

Merged (2 commits) on Jan 24, 2024
mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (54 additions, 0 deletions)

@@ -636,6 +636,60 @@ def YieldOp : OpenMP_Op<"yield",
let assemblyFormat = [{ ( `(` $results^ `:` type($results) `)` )? attr-dict}];
}

//===----------------------------------------------------------------------===//
// Distribute construct [2.9.4.1]
//===----------------------------------------------------------------------===//
def DistributeOp : OpenMP_Op<"distribute", [AttrSizedOperandSegments,
MemoryEffects<[MemWrite]>]> {
Contributor:

@shraiysh could you comment on the MemWrite effect here. I believe you previously worked on adding RecursiveMemoryEffects.

Contributor Author:

I'm not sure if @shraiysh is working on OpenMP anymore. There is a bit of nuance with respect to the memory effects: they apply to operations inside the distribute region and to containing ops, but not to ops at the same level. If CSE or some other pass decided to hoist things out of the inner region, that would be illegal (so in some sense the effects do apply to operations moved to the same level). Similarly, it would be illegal to hoist the omp.distribute outside an omp.teams. It is safest to mark the op with MemoryEffects<[MemWrite]>.
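A hypothetical IR sketch of the hoisting hazard described above (the load and its operand are illustrative, not part of this patch):

```mlir
omp.teams {
  omp.distribute {
    // Hoisting this load above omp.distribute (e.g. by CSE or LICM) would
    // be illegal: it must execute within the distribute region that binds
    // to the enclosing teams region.
    %v = llvm.load %ptr : !llvm.ptr -> i32
    omp.terminator
  }
  omp.terminator
}
```

Marking the op with MemWrite keeps such passes from speculating across the region boundary.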

let summary = "distribute construct";
let description = [{
The distribute construct specifies that the iterations of one or more loops
(optionally specified using the collapse clause) will be executed by the
initial teams in the context of their implicit tasks. The loops associated
with the distribute op start with the outermost loop enclosed by the
distribute op region and proceed down the loop nest toward the innermost
loop. The iterations are distributed across the initial threads of all
initial teams that execute the teams region to which the distribute region
binds.

The distribute loop construct specifies that the iterations of the loop(s)
will be executed in parallel by threads in the current context. These
iterations are spread across threads that already exist in the enclosing
region. The lower and upper bounds specify a half-open range: the
range includes the lower bound but does not include the upper bound. If the
`inclusive` attribute is specified then the upper bound is also included.

The `dist_schedule_static` attribute specifies a static schedule for this
loop, determining how the iterations are distributed across the teams.
The optional `chunk_size` operand further controls this distribution.

// TODO: private_var, firstprivate_var, lastprivate_var, collapse
}];
let arguments = (ins
UnitAttr:$dist_schedule_static,
Optional<IntLikeType>:$chunk_size,
Variadic<AnyType>:$allocate_vars,
Variadic<AnyType>:$allocators_vars,
OptionalAttr<OrderKindAttr>:$order_val);

let regions = (region AnyRegion:$region);

let assemblyFormat = [{
oilist(`dist_schedule_static` $dist_schedule_static
|`chunk_size` `(` $chunk_size `:` type($chunk_size) `)`
|`order` `(` custom<ClauseAttr>($order_val) `)`
|`allocate` `(`
custom<AllocateAndAllocator>(
$allocate_vars, type($allocate_vars),
$allocators_vars, type($allocators_vars)
) `)`
) $region attr-dict
}];

let hasVerifier = 1;
}
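As a sketch of the resulting assembly format (the function name and clause values are illustrative), the op can appear nested inside an omp.teams region:

```mlir
func.func @teams_distribute(%chunk : i32) {
  omp.teams {
    // Iterations of a loop nest inside this region would be distributed
    // across the initial teams created by the enclosing omp.teams.
    omp.distribute dist_schedule_static chunk_size(%chunk : i32) {
      omp.terminator
    }
    omp.terminator
  }
  return
}
```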

//===----------------------------------------------------------------------===//
// 2.10.1 task Construct
//===----------------------------------------------------------------------===//
mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (16 additions, 0 deletions)

@@ -1053,6 +1053,22 @@ LogicalResult SimdLoopOp::verify() {
return success();
}

//===----------------------------------------------------------------------===//
// Verifier for Distribute construct [2.9.4.1]
//===----------------------------------------------------------------------===//

LogicalResult DistributeOp::verify() {
if (this->getChunkSize() && !this->getDistScheduleStatic())
return emitOpError() << "chunk size set without "
"dist_schedule_static being present";

if (getAllocateVars().size() != getAllocatorsVars().size())
return emitError(
"expected equal sizes for allocate and allocator variables");

return success();
}

//===----------------------------------------------------------------------===//
// ReductionOp
//===----------------------------------------------------------------------===//
mlir/test/Dialect/OpenMP/invalid.mlir (9 additions, 0 deletions)

@@ -1657,3 +1657,12 @@ func.func @omp_target_exit_data(%map1: memref<?xi32>) {
}

llvm.mlir.global internal @_QFsubEx() : i32

// -----

func.func @omp_distribute(%data_var : memref<i32>) -> () {
// expected-error @below {{expected equal sizes for allocate and allocator variables}}
"omp.distribute"(%data_var) <{operandSegmentSizes = array<i32: 0, 1, 0>}> ({
"omp.terminator"() : () -> ()
}) : (memref<i32>) -> ()
}
mlir/test/Dialect/OpenMP/ops.mlir (30 additions, 0 deletions)

@@ -479,6 +479,36 @@ func.func @omp_simdloop_pretty_multiple(%lb1 : index, %ub1 : index, %step1 : ind
return
}

// CHECK-LABEL: omp_distribute
func.func @omp_distribute(%chunk_size : i32, %data_var : memref<i32>) -> () {
// CHECK: omp.distribute
"omp.distribute" () ({
omp.terminator
}) {} : () -> ()
// CHECK: omp.distribute
omp.distribute {
omp.terminator
}
// CHECK: omp.distribute dist_schedule_static
omp.distribute dist_schedule_static {
omp.terminator
}
// CHECK: omp.distribute dist_schedule_static chunk_size(%{{.+}} : i32)
omp.distribute dist_schedule_static chunk_size(%chunk_size : i32) {
omp.terminator
}
// CHECK: omp.distribute order(concurrent)
omp.distribute order(concurrent) {
omp.terminator
}
// CHECK: omp.distribute allocate(%{{.+}} : memref<i32> -> %{{.+}} : memref<i32>)
omp.distribute allocate(%data_var : memref<i32> -> %data_var : memref<i32>) {
omp.terminator
}
return
}


// CHECK-LABEL: omp_target
func.func @omp_target(%if_cond : i1, %device : si32, %num_threads : i32, %map1: memref<?xi32>, %map2: memref<?xi32>) -> () {
