[AMDGPU] Enable unaligned scratch accesses #110219

Merged on Oct 11, 2024 (3 commits)

Commits on Oct 10, 2024

  1. [AMDGPU] Enable unaligned scratch accesses

    This allows us to emit wide generic and scratch memory accesses when we do not
    have alignment information. In cases where accesses happen to be properly
    aligned or where generic accesses do not go to scratch memory, this improves
    performance of the generated code by a factor of up to 16x and reduces code
    size, especially when lowering memcpy and memmove intrinsics.
    
    Also: Make the use of FeatureUnalignedScratchAccess more consistent. In
    some places, code already assumed that unaligned accesses with the
    specialized flat scratch instructions are allowed regardless of
    FeatureUnalignedScratchAccess. This patch applies that interpretation
    everywhere.
    
    Part of SWDEV-455845.
    ritter-x2a committed Oct 10, 2024
    Commit 98a3ecf
  2. fixup! [AMDGPU] Enable unaligned scratch accesses

    make flat scratch not imply that unaligned scratch accesses are valid
    ritter-x2a committed Oct 10, 2024
    Commit 422514e
  3. fixup! fixup! [AMDGPU] Enable unaligned scratch accesses

    add FeatureUnalignedScratchAccess to gfx12
    ritter-x2a committed Oct 10, 2024
    Commit c128ad8
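
The first commit message attributes the speedup to emitting wide accesses even when no alignment information is available. The sketch below is a simplified model of that effect, not code from LLVM: `countAccessPairs`, its parameters, and the 16-byte maximum access width are hypothetical and chosen only for illustration. It counts how many load/store pairs a copy needs when each access is capped by the known alignment, compared with when unaligned wide accesses are permitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model, not LLVM code: counts the load/store pairs needed to
// copy `bytes` bytes when the widest single access is `maxWidth` bytes.
// `knownAlign` is the alignment the compiler can prove (1 = nothing known).
// If `allowUnaligned` is false, no access may be wider than `knownAlign`.
// The default maxWidth of 16 bytes is an assumption made for this sketch.
std::uint64_t countAccessPairs(std::uint64_t bytes, std::uint64_t knownAlign,
                               bool allowUnaligned,
                               std::uint64_t maxWidth = 16) {
  assert(maxWidth >= 1 && knownAlign >= 1);
  const std::uint64_t limit =
      allowUnaligned ? maxWidth : std::min(knownAlign, maxWidth);
  std::uint64_t count = 0;
  // Greedily use the widest permitted access, halving the width each step.
  for (std::uint64_t width = maxWidth; width >= 1; width /= 2) {
    if (width > limit)
      continue;
    count += bytes / width;
    bytes %= width;
  }
  return count;
}
```

In this model, a 64-byte copy with no alignment information takes 64 pairs when restricted to byte accesses (`countAccessPairs(64, 1, false)`) and 4 pairs when unaligned 16-byte accesses are allowed (`countAccessPairs(64, 1, true)`). That 16x reduction in instruction count is consistent with the upper bound quoted in the commit message, though actual run-time gains depend on the hardware and on whether the accesses happen to be aligned.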