-
Notifications
You must be signed in to change notification settings - Fork 11.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mlir][gpu] Add address space modifier to Barrier #110527
Open
FMarno
wants to merge
1
commit into
llvm:main
Choose a base branch
from
FMarno:gpu_barrier_memfence
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+71
−11
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does it mean for memory to be visible? What memory accesses is this referencing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might suggest "completed" or "committed" as synonyms here, if I've understood the semantics right.
That is, any memory operations involving the given memory spaces that were issued before the barrier must have their results reflected after the barrier completes.
Or, another way, any operation touching the listed memory spaces must be happens-before with any operation on those memory spaces after the barrier.
(This would allow us to not necessarily have
amdgpu.lds_barrier
- since we'd be able to express that asgpu.barrier [#gpu.address_space<workgroup>]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not trying to say that the memory is visible, but that the memory accesses are visible. I think it makes sense when paired with the previous paragraph
I think what @krzysz00 is saying matches.
I'm open to suggestions on how to make it more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some good points there, @krzysz00. I only know the amdgpu implementation and that's why I'd like to make sure the semantics make sense for everyone in a portable way.
The way I understand it, it's backend compiler's job to track any memory dependencies / synchronization within the same subgroup, and we only need
gpu.barrier
(the plain version) when the dependency can be across subgroups.I don't see how this aligns with the stated goal in the RFC:
lds_barrier is strictly more work than just gpu.barrier, no? It's
s_barrier
and 'please flush the shared memory fifo' . At least that's what c++ libraries do: https://github.com/ROCm/composable_kernel/blob/de3e3b642402eac5b4a466f6a2fa5e9f022ba680/include/ck/utility/synchronization.hpp#L20-L25.Maybe I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gpu.barrier
is stronger thanamdgpu.lds_barrier
. The current semantics ofgpu.barrier
are not AMD'ss_barrier
, they're(atomic fence) + s_barrier
.(I think there are are even explicit calls to LLVM memory fencing in the AMD lowering path for
gpu.barrier
).That is,
gpu.barrier
imposes a happens-before over all GPU memory by default, and this patch lets people loosen that.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and those libraries are implementing
amdgpu.lds_barrirer
. If they wanted to matchgpu.barrier
, they'd be waiting onvmcnt(0)
as wellThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what this means either. I don't want to be pedantic here, but this is just not the wording I'm used to / have seen in this context. I did some searching and found this thread that talks about something adjacent: memory access availability vs. visibility https://community.khronos.org/t/difference-between-availability-and-visibility/7401/2.
Maybe something like this (I reworded a few places):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that description basically covers it, but I would move a couple things around. Also I wouldn't say that
memfence
necessary weakens the memory fencing requirement, but it can be used for that.