[AMD] Added instr.sched guards for the FA-like kernels #5163

ravil-mobile · 2024-11-15T13:28:58Z

Extended AMDGPU instruction scheduling for Flash Attention-like kernels. The change adds sched.barriers (called guards) at the beginning and at the end of each scf.For op that contains at least 2 tt.Dot, tt.reduce and at least one math::Exp2Op ops. The guards keep instructions in the basic blocks adjacent to the for-loop bodies from being moved across the loop boundaries. According to test results, this improves the performance of the FA kernels by reducing VGPR spilling.
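The guard placement described above can be sketched as a toy model in Python. This is only an illustration, not the code in this PR: the actual change operates on MLIR ops in C++, while the `ForLoop` class, the op-name strings, and the helper names below are hypothetical. The thresholds are an assumption too: the sketch reads the condition as at least two `tt.dot` ops, at least one `tt.reduce` op, and at least one `exp2` op.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List

# Hypothetical op names, used only for this illustration.
DOT, REDUCE, EXP2, BARRIER = "tt.dot", "tt.reduce", "math.exp2", "sched.barrier"


@dataclass
class ForLoop:
    """Toy stand-in for an scf.for op: just the op names in its body."""
    body: List[str] = field(default_factory=list)


def looks_like_flash_attention(loop: ForLoop) -> bool:
    # Assumed thresholds: >= 2 dots, >= 1 reduce, >= 1 exp2.
    counts = Counter(loop.body)
    return counts[DOT] >= 2 and counts[REDUCE] >= 1 and counts[EXP2] >= 1


def add_guards(loop: ForLoop) -> bool:
    """Wrap the loop body in barriers if it matches the FA-like pattern."""
    if not looks_like_flash_attention(loop):
        return False
    # One guard at the start and one at the end of the body, so nothing
    # is scheduled across the loop boundaries.
    loop.body = [BARRIER, *loop.body, BARRIER]
    return True
```

In this model a loop with a single `tt.dot` (a plain GEMM-style loop) is left untouched, while a loop holding two dots, a reduce, and an exp2 gets a barrier on each side of its body.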

I am not making a trivial change, such as fixing a typo in a comment.
I have written a PR description following these rules.
I have run pre-commit run --from-ref origin/main --to-ref HEAD.
Select one of the following.
- I have added tests.
  - /test for lit tests
  - /unittest for C++ tests
  - /python/test for end-to-end tests
- This PR does not need a test because it only refactors source code; the existing tests should be sufficient.
Select one of the following.
- I have not added any lit tests.
- The lit tests I have added follow these best practices,
  including the "tests should be minimal" section. (Usually running Python code
  and using the instructions it generates is not minimal.)

ravil-mobile force-pushed the ravil/fa-sched branch from 0e3ea45 to 1c4a34e Compare November 15, 2024 14:56

[AMD] Added instr.sched guards for the FA-like kernels

2f4d48f

ravil-mobile force-pushed the ravil/fa-sched branch from 1c4a34e to 2f4d48f Compare November 15, 2024 16:51
