
[Transforms] Add constant_tensors_folding pass #74

Open — wants to merge 40 commits into base: main
Conversation

niuxiaog
Contributor

@niuxiaog niuxiaog commented May 15, 2024

This PR implements the constant_tensors_folding pass (RFC: PR #183) from issue #56 and issue #146:

  • The input MLIR entry function is split into two functions: entry() and runtime_fold(). runtime_fold() contains the operations whose input and output tensors are all constant. The new entry() contains the remaining operations, which depend on variable values and on the folded constant tensors.
  • When needed, compile_time_fold() can be enabled to fold constant tensors at compile time.
  • A constant tensor may be fully folded by several sequential operations. If full folding would increase the data size dramatically (e.g., through a BroadcastOp), partial folding is chosen instead.
  • A constant BroadcastOp is swapped with the constant operations after it, so that those operations can be included in partial folding. Currently there are strong constraints on these ops to ensure correctness.
  • Information necessary for runtime execution is added to the MLIR module as GlobalOps.
  • During the pass, the buffers for storing folded tensors are allocated using the APIs provided by the constant cache manager, which will be implemented in another PR: [Runtime] Constant cache manager and runtime pipeline #342.
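
As a sketch of the split described above (with hypothetical op names and shapes — not the exact IR the pass emits), a module might be rewritten roughly like this:

```mlir
// Before: a single entry; %w is constant, %x is a runtime input.
func.func @entry(%x: tensor<64x32xf32>) -> tensor<64x32xf32> {
  %w = arith.constant dense<1.0> : tensor<32x32xf32>
  // Constant-only computation: foldable ahead of time.
  %folded = "some.const_only_op"(%w) : (tensor<32x32xf32>) -> tensor<32x32xf32>
  // Depends on the variable input %x: must stay in entry().
  %y = "some.op"(%x, %folded) : (tensor<64x32xf32>, tensor<32x32xf32>) -> tensor<64x32xf32>
  return %y : tensor<64x32xf32>
}

// After: the constant subgraph moves into runtime_fold(), whose result is
// cached; entry() consumes the folded tensor instead of recomputing it.
func.func @runtime_fold() -> tensor<32x32xf32> {
  %w = arith.constant dense<1.0> : tensor<32x32xf32>
  %folded = "some.const_only_op"(%w) : (tensor<32x32xf32>) -> tensor<32x32xf32>
  return %folded : tensor<32x32xf32>
}
func.func @entry(%x: tensor<64x32xf32>, %folded: tensor<32x32xf32>) -> tensor<64x32xf32> {
  %y = "some.op"(%x, %folded) : (tensor<64x32xf32>, tensor<32x32xf32>) -> tensor<64x32xf32>
  return %y : tensor<64x32xf32>
}
```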

@niuxiaog niuxiaog force-pushed the xgniu/constant_weights_folding branch from acf3ae8 to 94f2813 Compare June 3, 2024 06:39
@niuxiaog niuxiaog force-pushed the xgniu/constant_weights_folding branch from a0ddebe to d7663a5 Compare June 4, 2024 03:00
@niuxiaog niuxiaog force-pushed the xgniu/constant_weights_folding branch from 387523a to d8d2d79 Compare August 20, 2024 06:20
@niuxiaog niuxiaog requested review from zhczhong, Menooker, AndreyPavlenko, ciyongch and ZhennanQin and removed request for zhczhong September 14, 2024 03:42
@niuxiaog niuxiaog changed the title [Transforms] Add constant_weights_folding pass [Transforms] Add constant_tensors_folding pass Sep 14, 2024
void ConstantSubgraphAnalysis::runOnOperation() {
  Operation *op = getOperation();
  auto &func =
      op->getRegions().front().getBlocks().front().getOperations().front();
Contributor
Is there any shortcut for this kind of operation?

@@ -53,6 +53,8 @@ void populateTensorPasses(mlir::OpPassManager &pm) {
// todo: padding propagation pass
// todo: layout propagation pass
// todo: tensor constant propagation pass
pm.addPass(createConstantSubgraphAnalysisPass());
pm.addPass(createConstantTensorFoldingPass());
Contributor
Shall we combine these two passes into one, and provide an option to do the analysis only if needed?


I feel the same. Maybe we should put the analysis into the pass, unless the const-subgraph-analysis is needed by more than one pass.

Contributor Author

OK, I will put them into one.

@niuxiaog niuxiaog linked an issue Sep 24, 2024 that may be closed by this pull request
@niuxiaog
Contributor Author

Though the PR is ready_to_review, we need PR #342 and benchgc's support to fully enable this feature. For OpenVINO integration, this feature can be enabled by modifying a file.

Successfully merging this pull request may close these issues.

const weight packing support