[mlir][vector] Relax the requirements on broadcast dims (llvm#99341)
NOTE: This is a follow-up for llvm#97049 in which the `in_bounds` attribute
was made mandatory.

This PR updates the semantics of the `in_bounds` attribute so that
broadcast dimensions are no longer required to be "in bounds".
Specifically, these xfer_read/xfer_write Ops become valid after this
change:

```mlir
  %read = vector.transfer_read %A[%base1, %base2], %pad
      {in_bounds = [false], permutation_map = affine_map<(d0, d1) -> (0)>}
      : memref<?x?xf32>, vector<9xf32>

  vector.transfer_write %vec, %A[%base1, %base2]
      {in_bounds = [false], permutation_map = affine_map<(d0, d1) -> (0)>}
      : vector<9xf32>, memref<?x?xf32>
```

Note that the value `false` merely means "may run out-of-bounds", i.e.,
the corresponding access can still be "in bounds". In fact, the folder
for xfer Ops is also updated (*): it sets the `in_bounds` value for
broadcast dims to `true` whenever all non-broadcast dims are marked as
"in bounds".
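
For illustration, here's a hypothetical sketch of that fold (the shapes,
indices and values below are made up for this example and are not taken
from the PR):

```mlir
  // Before folding: the broadcast dim (the constant `0` result of the
  // permutation map) is merely marked as "may run out-of-bounds".
  %read = vector.transfer_read %A[%c0, %c0], %pad
      {in_bounds = [true, false],
       permutation_map = affine_map<(d0, d1) -> (d0, 0)>}
      : memref<4x?xf32>, vector<4x9xf32>

  // After folding: since the only non-broadcast dim (d0) is "in bounds",
  // the broadcast dim is promoted to "in bounds" as well.
  %read = vector.transfer_read %A[%c0, %c0], %pad
      {in_bounds = [true, true],
       permutation_map = affine_map<(d0, d1) -> (d0, 0)>}
      : memref<4x?xf32>, vector<4x9xf32>
```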

Note that this PR doesn't change any of the lowerings. The changes in
"SuperVectorize.cpp", "Vectorization.cpp" and "AffineMap.cpp" are simple
reverts of recent changes in llvm#97049. Those were only meant to facilitate
making `in_bounds` mandatory and to work around the extra requirements
for broadcast dims (those requirements were removed in this PR). All
changes in tests are also reverts of changes from llvm#97049.

For context, here's a PR in which "broadcast" dims were forced to
always be "in-bounds":
  * https://reviews.llvm.org/D102566

(*) See `foldTransferInBoundsAttribute`.
banach-space authored and xgupta committed Oct 4, 2024
1 parent d4d1113 commit da3aa03
Showing 18 changed files with 81 additions and 81 deletions.
24 changes: 12 additions & 12 deletions mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
@@ -1290,12 +1290,12 @@ def Vector_TransferReadOp :
specifies if the transfer is guaranteed to be within the source bounds. If
set to "false", accesses (including the starting point) may run
out-of-bounds along the respective vector dimension as the index increases.
Non-vector and broadcast dimensions *must* always be in-bounds. The
`in_bounds` array length has to be equal to the vector rank. This attribute
has a default value: `false` (i.e. "out-of-bounds"). When skipped in the
textual IR, the default value is assumed. Similarly, the OP printer will
omit this attribute when all dimensions are out-of-bounds (i.e. the default
value is used).
Non-vector dimensions *must* always be in-bounds. The `in_bounds` array
length has to be equal to the vector rank. This attribute has a default
value: `false` (i.e. "out-of-bounds"). When skipped in the textual IR, the
default value is assumed. Similarly, the OP printer will omit this
attribute when all dimensions are out-of-bounds (i.e. the default value is
used).

A `vector.transfer_read` can be lowered to a simple load if all dimensions
are specified to be within bounds and no `mask` was specified.
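
As a quick illustration of that last point (a sketch only, not part of this
diff; the buffer, indices and shapes are assumed):

```mlir
  // All dims "in bounds" and no mask: a candidate for a plain vector load.
  %v = vector.transfer_read %mem[%i, %j], %pad {in_bounds = [true]}
      : memref<16x16xf32>, vector<8xf32>
```
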
@@ -1535,12 +1535,12 @@ def Vector_TransferWriteOp :
specifies if the transfer is guaranteed to be within the source bounds. If
set to "false", accesses (including the starting point) may run
out-of-bounds along the respective vector dimension as the index increases.
Non-vector and broadcast dimensions *must* always be in-bounds. The
`in_bounds` array length has to be equal to the vector rank. This attribute
has a default value: `false` (i.e. "out-of-bounds"). When skipped in the
textual IR, the default value is assumed. Similarly, the OP printer will
omit this attribute when all dimensions are out-of-bounds (i.e. the default
value is used).
Non-vector dimensions *must* always be in-bounds. The `in_bounds` array
length has to be equal to the vector rank. This attribute has a default
value: `false` (i.e. "out-of-bounds"). When skipped in the textual IR, the
default value is assumed. Similarly, the OP printer will omit this
attribute when all dimensions are out-of-bounds (i.e. the default value is
used).

A `vector.transfer_write` can be lowered to a simple store if all
dimensions are specified to be within bounds and no `mask` was specified.
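
Analogously, a sketch (not part of this diff) of a write that would qualify
for the simple-store lowering:

```mlir
  // All dims "in bounds" and unmasked: a candidate for a plain vector store.
  vector.transfer_write %val, %mem[%i, %j] {in_bounds = [true]}
      : vector<8xf32>, memref<16x16xf32>
```
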
7 changes: 2 additions & 5 deletions mlir/include/mlir/Interfaces/VectorInterfaces.td
@@ -234,12 +234,9 @@ def VectorTransferOpInterface : OpInterface<"VectorTransferOpInterface"> {
return constExpr && constExpr.getValue() == 0;
}

/// Return "true" if the vector transfer dimension `dim` is in-bounds. Also
/// return "true" if the dimension is a broadcast dimension. Return "false"
/// otherwise.
/// Return "true" if the vector transfer dimension `dim` is in-bounds.
/// Return "false" otherwise.
bool isDimInBounds(unsigned dim) {
if ($_op.isBroadcastDim(dim))
return true;
auto inBounds = $_op.getInBounds();
return ::llvm::cast<::mlir::BoolAttr>(inBounds[dim]).getValue();
}
13 changes: 1 addition & 12 deletions mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
@@ -1223,19 +1223,8 @@ static Operation *vectorizeAffineLoad(AffineLoadOp loadOp,
LLVM_DEBUG(dbgs() << "\n[early-vect]+++++ permutationMap: ");
LLVM_DEBUG(permutationMap.print(dbgs()));

// Make sure that the in_bounds attribute corresponding to a broadcast dim
// is set to `true` - that's required by the xfer Op.
// FIXME: We're not veryfying whether the corresponding access is in bounds.
// TODO: Use masking instead.
SmallVector<unsigned> broadcastedDims = permutationMap.getBroadcastDims();
SmallVector<bool> inBounds(vectorType.getRank(), false);

for (auto idx : broadcastedDims)
inBounds[idx] = true;

auto transfer = state.builder.create<vector::TransferReadOp>(
loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap,
inBounds);
loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap);

// Register replacement for future uses in the scope.
state.registerOpVectorReplacement(loadOp, transfer);
11 changes: 1 addition & 10 deletions mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1380,17 +1380,8 @@ vectorizeAsLinalgGeneric(RewriterBase &rewriter, VectorizationState &state,

SmallVector<Value> indices(linalgOp.getShape(opOperand).size(), zero);

// Make sure that the in_bounds attribute corresponding to a broadcast dim
// is `true`
SmallVector<unsigned> broadcastedDims = readMap.getBroadcastDims();
SmallVector<bool> inBounds(readType.getRank(), false);

for (auto idx : broadcastedDims)
inBounds[idx] = true;

Operation *read = rewriter.create<vector::TransferReadOp>(
loc, readType, opOperand->get(), indices, readMap,
ArrayRef<bool>(inBounds));
loc, readType, opOperand->get(), indices, readMap);
read = state.maskOperation(rewriter, read, linalgOp, indexingMap);
Value readValue = read->getResult(0);

37 changes: 27 additions & 10 deletions mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -3947,10 +3947,6 @@ verifyTransferOp(VectorTransferOpInterface op, ShapedType shapedType,
"as permutation_map results: ")
<< AffineMapAttr::get(permutationMap)
<< " vs inBounds of size: " << inBounds.size();
for (unsigned int i = 0, e = permutationMap.getNumResults(); i < e; ++i)
if (isa<AffineConstantExpr>(permutationMap.getResult(i)) &&
!llvm::cast<BoolAttr>(inBounds.getValue()[i]).getValue())
return op->emitOpError("requires broadcast dimensions to be in-bounds");

return success();
}
@@ -4138,22 +4134,43 @@ static LogicalResult foldTransferInBoundsAttribute(TransferOp op) {
bool changed = false;
SmallVector<bool, 4> newInBounds;
newInBounds.reserve(op.getTransferRank());
// Idxs of non-bcast dims - used when analysing bcast dims.
SmallVector<unsigned> nonBcastDims;

// 1. Process non-broadcast dims
for (unsigned i = 0; i < op.getTransferRank(); ++i) {
// Already marked as in-bounds, nothing to see here.
// 1.1. Already marked as in-bounds, nothing to see here.
if (op.isDimInBounds(i)) {
newInBounds.push_back(true);
continue;
}
// Currently out-of-bounds, check whether we can statically determine it is
// inBounds.
// 1.2. Currently out-of-bounds, check whether we can statically determine
// it is inBounds.
bool inBounds = false;
auto dimExpr = dyn_cast<AffineDimExpr>(permutationMap.getResult(i));
assert(dimExpr && "Broadcast dims must be in-bounds");
auto inBounds =
isInBounds(op, /*resultIdx=*/i, /*indicesIdx=*/dimExpr.getPosition());
if (dimExpr) {
inBounds = isInBounds(op, /*resultIdx=*/i,
/*indicesIdx=*/dimExpr.getPosition());
nonBcastDims.push_back(i);
}

newInBounds.push_back(inBounds);
// We commit the pattern if it is "more inbounds".
changed |= inBounds;
}

// 2. Handle broadcast dims
// If all non-broadcast dims are "in bounds", then all bcast dims should be
// "in bounds" as well.
bool allNonBcastDimsInBounds = llvm::all_of(
nonBcastDims, [&newInBounds](unsigned idx) { return newInBounds[idx]; });
if (allNonBcastDimsInBounds) {
for (size_t idx : permutationMap.getBroadcastDims()) {
changed |= !newInBounds[idx];
newInBounds[idx] = true;
}
}

if (!changed)
return failure();
// OpBuilder is only used as a helper to build an I64ArrayAttr.
6 changes: 3 additions & 3 deletions mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
@@ -133,7 +133,7 @@ func.func @materialize_read(%M: index, %N: index, %O: index, %P: index) {
affine.for %i1 = 0 to %N {
affine.for %i2 = 0 to %O {
affine.for %i3 = 0 to %P step 5 {
%f = vector.transfer_read %A[%i0, %i1, %i2, %i3], %f0 {in_bounds = [false, true, false], permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, 0, d0)>} : memref<?x?x?x?xf32>, vector<5x4x3xf32>
%f = vector.transfer_read %A[%i0, %i1, %i2, %i3], %f0 {permutation_map = affine_map<(d0, d1, d2, d3) -> (d3, 0, d0)>} : memref<?x?x?x?xf32>, vector<5x4x3xf32>
// Add a dummy use to prevent dead code elimination from removing
// transfer read ops.
"dummy_use"(%f) : (vector<5x4x3xf32>) -> ()
@@ -507,7 +507,7 @@ func.func @transfer_read_with_tensor(%arg: tensor<f32>) -> vector<1xf32> {
// CHECK-NEXT: %[[RESULT:.*]] = vector.broadcast %[[EXTRACTED]] : f32 to vector<1xf32>
// CHECK-NEXT: return %[[RESULT]] : vector<1xf32>
%f0 = arith.constant 0.0 : f32
%0 = vector.transfer_read %arg[], %f0 {in_bounds = [true], permutation_map = affine_map<()->(0)>} :
%0 = vector.transfer_read %arg[], %f0 {permutation_map = affine_map<()->(0)>} :
tensor<f32>, vector<1xf32>
return %0: vector<1xf32>
}
@@ -746,7 +746,7 @@ func.func @cannot_lower_transfer_read_with_leading_scalable(%arg0: memref<?x4xf3
func.func @does_not_crash_on_unpack_one_dim(%subview: memref<1x1x1x1xi32>, %mask: vector<1x1xi1>) -> vector<1x1x1x1xi32> {
%c0 = arith.constant 0 : index
%c0_i32 = arith.constant 0 : i32
%3 = vector.transfer_read %subview[%c0, %c0, %c0, %c0], %c0_i32, %mask {in_bounds = [false, true, true, false], permutation_map = #map1}
%3 = vector.transfer_read %subview[%c0, %c0, %c0, %c0], %c0_i32, %mask {permutation_map = #map1}
: memref<1x1x1x1xi32>, vector<1x1x1x1xi32>
return %3 : vector<1x1x1x1xi32>
}
6 changes: 3 additions & 3 deletions mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir
@@ -22,7 +22,7 @@ func.func @vec1d_1(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK-NEXT: %{{.*}} = affine.apply #[[$map_id1]](%[[C0]])
// CHECK-NEXT: %{{.*}} = affine.apply #[[$map_id1]](%[[C0]])
// CHECK-NEXT: %{{.*}} = arith.constant 0.0{{.*}}: f32
// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true], permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i0 = 0 to %M { // vectorized due to scalar -> vector
%a0 = affine.load %A[%c0, %c0] : memref<?x?xf32>
}
@@ -425,7 +425,7 @@ func.func @vec_rejected_8(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK: %{{.*}} = arith.constant 0.0{{.*}}: f32
// CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true], permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
// CHECK: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i17 = 0 to %M { // not vectorized, the 1-D pattern that matched %{{.*}} in DFS post-order prevents vectorizing %{{.*}}
affine.for %i18 = 0 to %M { // vectorized due to scalar -> vector
%a18 = affine.load %A[%c0, %c0] : memref<?x?xf32>
@@ -459,7 +459,7 @@ func.func @vec_rejected_9(%A : memref<?x?xf32>, %B : memref<?x?x?xf32>) {
// CHECK: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK-NEXT: %{{.*}} = affine.apply #[[$map_id1]](%{{.*}})
// CHECK-NEXT: %{{.*}} = arith.constant 0.0{{.*}}: f32
// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {in_bounds = [true], permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
// CHECK-NEXT: {{.*}} = vector.transfer_read %{{.*}}[%{{.*}}, %{{.*}}], %{{.*}} {permutation_map = #[[$map_proj_d0d1_0]]} : memref<?x?xf32>, vector<128xf32>
affine.for %i17 = 0 to %M { // not vectorized, the 1-D pattern that matched %i18 in DFS post-order prevents vectorizing %{{.*}}
affine.for %i18 = 0 to %M { // vectorized due to scalar -> vector
%a18 = affine.load %A[%c0, %c0] : memref<?x?xf32>
4 changes: 2 additions & 2 deletions mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir
@@ -123,8 +123,8 @@ func.func @vectorize_matmul(%arg0: memref<?x?xf32>, %arg1: memref<?x?xf32>, %arg
// VECT: affine.for %[[I2:.*]] = #[[$map_id1]](%[[C0]]) to #[[$map_id1]](%[[M]]) step 4 {
// VECT-NEXT: affine.for %[[I3:.*]] = #[[$map_id1]](%[[C0]]) to #[[$map_id1]](%[[N]]) step 8 {
// VECT-NEXT: affine.for %[[I4:.*]] = #[[$map_id1]](%[[C0]]) to #[[$map_id1]](%[[K]]) {
// VECT: %[[A:.*]] = vector.transfer_read %{{.*}}[%[[I4]], %[[I3]]], %{{.*}} {in_bounds = [true, false], permutation_map = #[[$map_proj_d0d1_zerod1]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT: %[[B:.*]] = vector.transfer_read %{{.*}}[%[[I2]], %[[I4]]], %{{.*}} {in_bounds = [false, true], permutation_map = #[[$map_proj_d0d1_d0zero]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT: %[[A:.*]] = vector.transfer_read %{{.*}}[%[[I4]], %[[I3]]], %{{.*}} {permutation_map = #[[$map_proj_d0d1_zerod1]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT: %[[B:.*]] = vector.transfer_read %{{.*}}[%[[I2]], %[[I4]]], %{{.*}} {permutation_map = #[[$map_proj_d0d1_d0zero]]} : memref<?x?xf32>, vector<4x8xf32>
// VECT-NEXT: %[[C:.*]] = arith.mulf %[[B]], %[[A]] : vector<4x8xf32>
// VECT: %[[D:.*]] = vector.transfer_read %{{.*}}[%[[I2]], %[[I3]]], %{{.*}} : memref<?x?xf32>, vector<4x8xf32>
// VECT-NEXT: %[[E:.*]] = arith.addf %[[D]], %[[C]] : vector<4x8xf32>
@@ -141,7 +141,7 @@ func.func @affine_map_with_expr_2(%arg0: memref<8x12x16xf32>, %arg1: memref<8x24
// CHECK-NEXT: %[[S1:.*]] = affine.apply #[[$MAP_ID4]](%[[ARG3]], %[[ARG4]], %[[I0]])
// CHECK-NEXT: %[[S2:.*]] = affine.apply #[[$MAP_ID5]](%[[ARG3]], %[[ARG4]], %[[I0]])
// CHECK-NEXT: %[[CST:.*]] = arith.constant 0.000000e+00 : f32
// CHECK-NEXT: %[[S3:.*]] = vector.transfer_read %[[ARG0]][%[[S0]], %[[S1]], %[[S2]]], %[[CST]] {in_bounds = [true], permutation_map = #[[$MAP_ID6]]} : memref<8x12x16xf32>, vector<8xf32>
// CHECK-NEXT: %[[S3:.*]] = vector.transfer_read %[[ARG0]][%[[S0]], %[[S1]], %[[S2]]], %[[CST]] {permutation_map = #[[$MAP_ID6]]} : memref<8x12x16xf32>, vector<8xf32>
// CHECK-NEXT: vector.transfer_write %[[S3]], %[[ARG1]][%[[ARG3]], %[[ARG4]], %[[ARG5]]] : vector<8xf32>, memref<8x24x48xf32>
// CHECK-NEXT: }
// CHECK-NEXT: }
2 changes: 1 addition & 1 deletion mlir/test/Dialect/Linalg/hoisting.mlir
@@ -200,7 +200,7 @@ func.func @hoist_vector_transfer_pairs_in_affine_loops(%memref0: memref<64x64xi3
affine.for %arg3 = 0 to 64 {
affine.for %arg4 = 0 to 64 step 16 {
affine.for %arg5 = 0 to 64 {
%0 = vector.transfer_read %memref0[%arg3, %arg5], %c0_i32 {in_bounds = [true], permutation_map = affine_map<(d0, d1) -> (0)>} : memref<64x64xi32>, vector<16xi32>
%0 = vector.transfer_read %memref0[%arg3, %arg5], %c0_i32 {permutation_map = affine_map<(d0, d1) -> (0)>} : memref<64x64xi32>, vector<16xi32>
%1 = vector.transfer_read %memref1[%arg5, %arg4], %c0_i32 : memref<64x64xi32>, vector<16xi32>
%2 = vector.transfer_read %memref2[%arg3, %arg4], %c0_i32 : memref<64x64xi32>, vector<16xi32>
%3 = arith.muli %0, %1 : vector<16xi32>
2 changes: 1 addition & 1 deletion mlir/test/Dialect/Linalg/vectorization.mlir
@@ -130,7 +130,7 @@ func.func @vectorize_dynamic_1d_broadcast(%arg0: tensor<?xf32>,
// CHECK-LABEL: @vectorize_dynamic_1d_broadcast
// CHECK: %[[VAL_3:.*]] = arith.constant 0 : index
// CHECK: %[[VAL_4:.*]] = tensor.dim %{{.*}}, %[[VAL_3]] : tensor<?xf32>
// CHECK: %[[VAL_7:.*]] = vector.transfer_read %{{.*}} {in_bounds = {{.*}}, permutation_map = #{{.*}}} : tensor<?xf32>, vector<4xf32>
// CHECK: %[[VAL_7:.*]] = vector.transfer_read %{{.*}} {permutation_map = #{{.*}}} : tensor<?xf32>, vector<4xf32>
// CHECK: %[[VAL_9:.*]] = vector.create_mask %[[VAL_4]] : vector<4xi1>
// CHECK: %[[VAL_10:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
// CHECK: %[[VAL_12:.*]] = vector.mask %[[VAL_9]] { vector.transfer_read %{{.*}} {in_bounds = [true]} : tensor<?xf32>, vector<4xf32> } : vector<4xi1> -> vector<4xf32>
28 changes: 18 additions & 10 deletions mlir/test/Dialect/Vector/invalid.mlir
@@ -454,6 +454,15 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {

// -----

func.func @test_vector.transfer_read(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
%0 = vector.transfer_read %arg0[%c3, %c3], %cst {permutation_map = affine_map<(d0, d1)->(1)>} : memref<?x?xf32>, vector<128xf32>
}

// -----

func.func @test_vector.transfer_read(%arg0: memref<?x?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant 3.0 : f32
@@ -505,16 +514,6 @@ func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<2x3xf32>>) {

// -----

func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<2x3xf32>>) {
%c3 = arith.constant 3 : index
%f0 = arith.constant 0.0 : f32
%vf0 = vector.splat %f0 : vector<2x3xf32>
// expected-error@+1 {{requires broadcast dimensions to be in-bounds}}
%0 = vector.transfer_read %arg0[%c3, %c3], %vf0 {in_bounds = [false, true], permutation_map = affine_map<(d0, d1)->(0, d1)>} : memref<?x?xvector<2x3xf32>>, vector<1x1x2x3xf32>
}

// -----

func.func @test_vector.transfer_read(%arg0: memref<?x?xvector<2x3xf32>>) {
%c3 = arith.constant 3 : index
%f0 = arith.constant 0.0 : f32
@@ -618,6 +617,15 @@ func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {

// -----

func.func @test_vector.transfer_write(%arg0: memref<?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<128 x f32>
// expected-error@+1 {{requires a projected permutation_map (at most one dim or the zero constant can appear in each result)}}
vector.transfer_write %cst, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(1)>} : vector<128xf32>, memref<?x?xf32>
}

// -----

func.func @test_vector.transfer_write(%arg0: memref<?x?x?xf32>) {
%c3 = arith.constant 3 : index
%cst = arith.constant dense<3.0> : vector<3 x 7 x f32>
2 changes: 1 addition & 1 deletion mlir/test/Dialect/Vector/ops.mlir
@@ -70,7 +70,7 @@ func.func @vector_transfer_ops(%arg0: memref<?x?xf32>,
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]]], %{{.*}}, %{{.*}} : memref<?x?xf32>, vector<5xf32>
%8 = vector.transfer_read %arg0[%c3, %c3], %f0, %m : memref<?x?xf32>, vector<5xf32>
// CHECK: vector.transfer_read %{{.*}}[%[[C3]], %[[C3]], %[[C3]]], %{{.*}}, %{{.*}} : memref<?x?x?xf32>, vector<5x4x8xf32>
%9 = vector.transfer_read %arg4[%c3, %c3, %c3], %f0, %m2 {in_bounds = [false, false, true], permutation_map = affine_map<(d0, d1, d2)->(d1, d0, 0)>} : memref<?x?x?xf32>, vector<5x4x8xf32>
%9 = vector.transfer_read %arg4[%c3, %c3, %c3], %f0, %m2 {permutation_map = affine_map<(d0, d1, d2)->(d1, d0, 0)>} : memref<?x?x?xf32>, vector<5x4x8xf32>

// CHECK: vector.transfer_write
vector.transfer_write %0, %arg0[%c3, %c3] {permutation_map = affine_map<(d0, d1)->(d0)>} : vector<128xf32>, memref<?x?xf32>
@@ -327,13 +327,11 @@ func.func @masked_permutation_xfer_read_fixed_width(
%c0 = arith.constant 0 : index
%3 = vector.mask %mask {
vector.transfer_read %dest[%c0, %c0], %cst {
in_bounds = [false, true, false],
permutation_map = affine_map<(d0, d1) -> (d1, 0, d0)>
} : tensor<?x1xf32>, vector<1x4x4xf32>
} : vector<4x1xi1> -> vector<1x4x4xf32>

"test.some_use"(%3) : (vector<1x4x4xf32>) -> ()

return
}
