WIP: More Optimizations and SIMD fixes for MSVC & ARM #413

Draft
wants to merge 21 commits into
base: master

@recp (Owner) commented Apr 6, 2024

  • [WIP] More SIMD optimizations
    • Matrix invert
    • Non-Square matrices
    • Transforms
    • AABB
    • Frustum
    • SIMD for int types
    • ...
  • Fix compiling on MSVC + ARM32 (don't align types on MSVC + ARM32 because of "719: formal parameter with requested alignment of 16 won't be aligned")
  • msvc, simd: fix simd headers for _M_ARM64EC
  • arm, neon: fix neon support on GCC ARM
  • Try to interleave independent instructions to take advantage of ILP where possible (compilers may already do this, but giving the hint manually is nice)
  • Try to reduce port pressure where possible, e.g. use _mm_blend_ps in place of many _mm_shuffle_ps calls (this step may take time and needs to be profiled, e.g. Intel VTune can be used to find the bottleneck, plus speed tests). Maybe in other PRs...
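
To illustrate the ILP point above, here is a minimal scalar sketch, not code from this PR. The helper names `dot_serial` and `dot_interleaved` are hypothetical and exist only for this example. The first version chains every add through one accumulator, so each iteration waits on the previous one. The second keeps four independent accumulators, so the multiply-adds in one iteration do not depend on each other and the CPU may overlap them.

```c
#include <stddef.h>

/* Hypothetical helper (illustration only): one accumulator, so every
   add depends on the result of the previous add. */
static float dot_serial(const float *a, const float *b, size_t n) {
  float acc = 0.0f;
  size_t i;
  for (i = 0; i < n; i++)
    acc += a[i] * b[i];
  return acc;
}

/* Hypothetical helper (illustration only): four independent
   accumulators, so the four multiply-adds in one iteration have no
   data dependency on each other. */
static float dot_interleaved(const float *a, const float *b, size_t n) {
  float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
  size_t i = 0;

  for (; i + 4 <= n; i += 4) {
    acc0 += a[i + 0] * b[i + 0];
    acc1 += a[i + 1] * b[i + 1];
    acc2 += a[i + 2] * b[i + 2];
    acc3 += a[i + 3] * b[i + 3];
  }

  /* Handle the remaining 0..3 elements. */
  for (; i < n; i++)
    acc0 += a[i] * b[i];

  /* Combine the partial sums pairwise. */
  return (acc0 + acc1) + (acc2 + acc3);
}
```

Splitting the sum changes the order of floating-point additions, so the two versions can differ in the last bits for general inputs. Whether the interleaved form is actually faster depends on the compiler and target, which is why profiling is still needed.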
