GPU: Apply GPU attributes to promoted expressions (chapel-lang#24981)
This PR extends Chapel's GPU variable attribute support to include promoted expressions. Thus, the following now works as expected:

```Chapel
on here.gpus[0] {
  @gpu.blockSize(32)
  @assertOnGpu
  A = A + 1;
}
```

This was a relatively challenging task, since promoted expressions create several functions with existing formals in the middle of resolution. To achieve this, the PR:

* Adds a 'gpu info' field to the `Promotion` information struct, to be used when building promotion wrappers. This field is polled for any GPU attributes that should be inserted into the bodies of promoted functions.
* Adjusts the promotion process to insert the formals that the promotion wrappers need in order to capture outer variables. E.g., if a promoted function is marked with `blockSize(expr)`, the free variables in `expr` need to be added as formals to all the promotion wrappers.
  * An implementation detail of this is that the PR exposes the 'consider for outer' method from building loop functions, to avoid creating formals for global variables, modules, etc.
  * To make sure that all created iterators have the same signature, the formal insertion happens before creating the additional leader and follower iterators.
* Adds logic to call resolution to handle the newly inserted formals, by threading through and modifying 'actualFormals', which tracks how many actuals were passed and what formals they map to. This also involves modifying the call to the wrapper function to add the captured variables.
* Adjusts the `SymbolMap` to allow transitively replacing a symbol. That is, if a substitution maps a variable A to B, and B to C, then `map.get(A)` now returns C. This helps the case of adding outer variables, which first involves redirecting the 'outer variable' to point to the newly-inserted formal, then redirecting it to point at a copy of the formal (when creating the leader/follower iterators). This seems a very benign change to me and I'm quite surprised no one else has run into this before.
* Adds a mechanism for allowing duplicate GPU attribute calls. The problem with expressions like `A + 1 + 1` is that it's easiest to simply insert a copy of `setBlockSize` for each underlying `forall` loop / promoted expression. However, this ends up creating several copies of `setBlockSize`, all from the same source. The copies themselves are benign, since they ought to evaluate to the same thing, and since the GPU transformations only pick one copy anyway (so, no duplicated side effects), but they would otherwise be reported as conflicting. To work around this, calls to the block size primitive can now optionally include a second argument for a 'unique identifier'. If two block size calls are found to "conflict", but have the same unique identifier, an error is not emitted. Thus, `blockSize` copies inserted into nested promoted expressions by the GPU attribute do not cause problems.

Reviewed by @e-kayrakli and @ShreyasKhandekar -- thanks!

# Testing

- [x] paratest
- [x] `test/gpu/native`
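
For readers unfamiliar with the two supporting changes above (transitive `SymbolMap` lookup and the 'unique identifier' on block size calls), here is a minimal C++ sketch of the ideas. It is not the compiler's code: `TransitiveMap`, `BlockSizeCall`, and `hasConflict` are hypothetical names, symbols are modeled as plain strings, and the rule "any two block size calls on one loop conflict unless they share an identifier" is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the compiler's SymbolMap; symbols are strings here.
class TransitiveMap {
 public:
  void put(const std::string& from, const std::string& to) { map_[from] = to; }

  // Follows the substitution chain (A -> B -> C) to its end.
  // Returns std::nullopt if `key` has no substitution at all.
  std::optional<std::string> get(const std::string& key) const {
    auto it = map_.find(key);
    if (it == map_.end()) return std::nullopt;
    std::string current = it->second;
    // Bound the walk by the map size so an accidental cycle still terminates.
    for (std::size_t steps = 0; steps < map_.size(); ++steps) {
      auto next = map_.find(current);
      if (next == map_.end()) break;
      current = next->second;
    }
    return current;
  }

 private:
  std::unordered_map<std::string, std::string> map_;
};

// Hypothetical model of one block size call attached to a loop.
struct BlockSizeCall {
  // Optional second argument; copies inserted from one attribute share a value.
  std::optional<int> uniqueId;
};

// Assumed rule: several calls on one loop conflict unless every call carries
// the same unique identifier as the first one.
bool hasConflict(const std::vector<BlockSizeCall>& calls) {
  for (std::size_t i = 1; i < calls.size(); ++i) {
    const BlockSizeCall& first = calls[0];
    const BlockSizeCall& other = calls[i];
    bool sameOrigin = first.uniqueId.has_value() &&
                      other.uniqueId.has_value() &&
                      *first.uniqueId == *other.uniqueId;
    if (!sameOrigin) return true;
  }
  return false;
}
```

With substitutions A to B and B to C stored, `get("A")` yields C, which mirrors the outer variable being redirected first to the new formal and then to the formal's copy. Two calls carrying the same identifier are treated as copies from one attribute, while calls without a shared identifier are still reported.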
Showing 21 changed files with 331 additions and 239 deletions.