Commit
Tune insert/extract subvector cost
Change-Id: I5c18aa058ff272ca399687ef73eb7ead427f1e76
jrbyrnes committed May 20, 2024
1 parent 128bd38 commit bb0f618
Showing 3 changed files with 41 additions and 40 deletions.
5 changes: 3 additions & 2 deletions llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
@@ -1159,9 +1159,10 @@ InstructionCost GCNTTIImpl::getShuffleCost(TTI::ShuffleKind Kind,
}
case TTI::SK_ExtractSubvector:
case TTI::SK_InsertSubvector: {
if (HasVOP3P && NumVectorElts == 2)
// Even aligned accesses are free
if (!(Index % 2))
return 0;
// Insert/extract subvectors require only shifts / extract code to get the
// Insert/extract subvectors only require shifts / extract code to get the
// relevant bits
return alignTo(RequestedElts, 2) / 2;
}
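
For reviewers who want to sanity-check the numbers, here is a minimal standalone sketch of the arithmetic in the hunk above. It is not the LLVM implementation: `subvectorShuffleCost` and `alignToMultiple` are hypothetical names (the latter is a stand-in for `llvm::alignTo`), plain integers replace `InstructionCost`, and the `HasVOP3P && NumVectorElts == 2` condition is deliberately left out because this view does not show how it interacts with the other lines.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::alignTo: round Value up to the next
// multiple of Align (Align must be non-zero).
inline uint64_t alignToMultiple(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Hypothetical model of the insert/extract subvector cost shown in the hunk.
// Index is the starting element of the subvector; RequestedElts is the number
// of elements being inserted or extracted.
inline uint64_t subvectorShuffleCost(uint64_t Index, uint64_t RequestedElts) {
  // Even-aligned accesses are treated as free.
  if (Index % 2 == 0)
    return 0;
  // Otherwise charge one unit per pair of elements, rounding odd counts up.
  return alignToMultiple(RequestedElts, 2) / 2;
}
```

Under these assumptions, an odd-index access of 3 elements costs 2 (3 rounds up to 4, then halves), while any even-index access costs 0 regardless of width.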