
[Transform] mlir: named op layout propagation #101

Open · wants to merge 26 commits into base: main

Conversation

@yifeizh2 (Contributor) commented May 27, 2024

Add layout-propagation-related passes.

Tracking: #42

@yifeizh2 yifeizh2 added the "WIP work in progress" and "DO NOT MERGE" labels May 27, 2024
@yifeizh2 yifeizh2 linked an issue Jun 2, 2024 that may be closed by this pull request
@yifeizh2 yifeizh2 changed the title from "mlir: named op layout propagation" to "[Transform] mlir: named op layout propagation" Jun 6, 2024
@yifeizh2 yifeizh2 changed the base branch from main to zhicong/deep_tile_matmul July 30, 2024 05:23
@zhczhong zhczhong force-pushed the zhicong/deep_tile_matmul branch 4 times, most recently from b3bf8dc to 23dfa97 August 1, 2024 01:27
@zhczhong zhczhong force-pushed the zhicong/deep_tile_matmul branch 3 times, most recently from 21b4dfe to 5765bc7 August 6, 2024 03:00
@yifeizh2 yifeizh2 added the "ready to review" label and removed the "WIP work in progress" and "DO NOT MERGE" labels Sep 18, 2024
@kurapov-peter (Contributor) left a comment

I've left some comments but haven't gone through all of it yet. A general question I have: how expensive is the analysis?

@@ -5,6 +5,7 @@ gc_set_mlir_link_components(MLIR_LINK_COMPONENTS
 gc_add_mlir_library(GcAnalysis
   TargetDescriptionAnalysis.cpp
   MatmulConfigAnalysis.cpp
+  GlobalAnalysis.cpp

Could you please give it a more descriptive name?

Comment on lines +34 to +37
  } else if (isa<tensor::ExpandShapeOp, tensor::CollapseShapeOp, tensor::PadOp>(
                 op))
    return true;
  return false;

Suggested change:
-  } else if (isa<tensor::ExpandShapeOp, tensor::CollapseShapeOp, tensor::PadOp>(
-                 op))
-    return true;
-  return false;
+  }
+  return isa<tensor::ExpandShapeOp, tensor::CollapseShapeOp, tensor::PadOp>(op);

  return false;
}

bool hasAllTensorSemantics(linalg::LinalgOp linalgOp) {

I think this should be available in some utils already?
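For reference, a minimal sketch of what this helper could reduce to, assuming upstream MLIR's LinalgOp interface method hasPureTensorSemantics() covers the intended check (an assumption about the upstream API, not something verified in this PR):

#include "mlir/Dialect/Linalg/IR/Linalg.h"

// Sketch only: assumes mlir::linalg::LinalgOp::hasPureTensorSemantics()
// matches this helper's intent, i.e. every operand of the op is a tensor
// (no memref/buffer operands).
bool hasAllTensorSemantics(mlir::linalg::LinalgOp linalgOp) {
  return linalgOp.hasPureTensorSemantics();
}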

];
}

def PostProcessPackUnpack : Pass<"post-process-pack-unpack"> {

Does it need to be a separate pass if it is a post-processing?


I'd add tests with the smallest possible patterns, just to capture the functionality of all passes and their aspects. They also help readers understand what the passes do.

Comment on lines +221 to +225
// copied from mlir
static SmallVector<int64_t>
projectToInnerMostNonUnitDimsPos(ArrayRef<int64_t> dimsPos,
                                 ArrayRef<ReassociationIndices> reassocIndices,
                                 ArrayRef<int64_t> targetShape) {

Could we move it into utils in the upstream?

Comment on lines +254 to +256
// if forceBlocking is set to true, we will unconditionally convert
// input/weight/output to blocking layout; otherwise we follow the default
// heuristic logic

Suggested change:
-// if forceBlocking is set to true, we will unconditionally convert
-// input/weight/output to blocking layout; otherwise we follow the default
-// heuristic logic
+// if forceBlocking is set to true, we will unconditionally convert
+// input/weight/output to blocking layout; otherwise we follow the default
+// heuristic logic, which is ?...

Comment on lines +301 to +317
  if (constantA || curInputLayouts[0].isBlocking() || (M % iim) || (K % iik) ||
      (elementType.isBF16() &&
       curInputLayouts[0] == TensorLayout({1, 0}, {}, {}))) {
    ALayouts.emplace_back(
        APackInfo.first, APackInfo.second,
        SmallVector<OpFoldResult>{rewriter.getIndexAttr(iim),
                                  rewriter.getIndexAttr(iik)});
  } else {
    ALayouts.emplace_back(APackInfo.first, SmallVector<int64_t>{},
                          SmallVector<OpFoldResult>{});
  }
  if (constantB || curInputLayouts[1].isBlocking() || K % iik || N % iin ||
      elementType.isBF16()) {
    BLayouts.emplace_back(
        BPackInfo.first, BPackInfo.second,
        SmallVector<OpFoldResult>{rewriter.getIndexAttr(iik),
                                  rewriter.getIndexAttr(iin)});

Can there be some abstraction here to avoid platform specificity?
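One possible shape for such an abstraction, as a sketch only (LayoutCostModel, OperandInfo, and shouldBlockOperand are hypothetical names, not part of this PR): move the divisibility/dtype checks behind a target-provided predicate so the pass body stays platform-neutral.

#include <cstdint>

#include "mlir/IR/Types.h"

// Hypothetical interface: the pass asks a target-specific cost model whether
// an operand should be packed to a blocking layout, instead of hard-coding
// the BF16/divisibility heuristics inline.
struct OperandInfo {
  bool isConstant;        // e.g. constant weights
  bool alreadyBlocked;    // current layout is already blocking
  int64_t dim0, dim1;     // the two dims that would be tiled (e.g. M, K)
  int64_t tile0, tile1;   // candidate inner tile sizes (e.g. iim, iik)
  mlir::Type elementType; // operand element type
};

class LayoutCostModel {
public:
  virtual ~LayoutCostModel() = default;
  // Returns true if packing this operand into a blocking layout is desired.
  virtual bool shouldBlockOperand(const OperandInfo &info) const = 0;
};

// A default model approximating the heuristic currently inlined above;
// other targets could subclass it without touching the pass.
class DefaultCostModel : public LayoutCostModel {
public:
  bool shouldBlockOperand(const OperandInfo &info) const override {
    return info.isConstant || info.alreadyBlocked ||
           info.dim0 % info.tile0 != 0 || info.dim1 % info.tile1 != 0 ||
           info.elementType.isBF16();
  }
};

With something like this, each of the two if/else branches above would collapse into a single costModel.shouldBlockOperand(...) query per operand.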

Comment on lines +216 to +217
  } else if (isa<linalg::SoftmaxOp>(linalgOp) || isa<linalg::MapOp>(linalgOp) ||
             isa<linalg::YieldOp>(linalgOp) || isa<linalg::IndexOp>(linalgOp)) {

Should it fail on generics too?

Comment on lines +521 to +523
  if (failed(packVNNIMMT4D(rewriter, linalgOp)))
    return failure();
  return success();

Suggested change:
-  if (failed(packVNNIMMT4D(rewriter, linalgOp)))
-    return failure();
-  return success();
+  return packVNNIMMT4D(rewriter, linalgOp);

Successfully merging this pull request may close these issues.

Develop MLIR-based layout propagation passes
3 participants