Should 128-bit bit-shift/rotation operators be added? #5
Comments
Digging into this a bit, here's an example of what various native platforms generate for bit-shifting operations. This is the wasm that LLVM emits, lightly hand-edited to return two i64 values instead of storing them to memory; it also shows what Wasmtime generates for native code today. Given this, it looks like there is significant room for improvement, either in Wasmtime itself or by adding these operations to this proposal. It would be best, however, to validate with a benchmark that these operations are indeed significantly faster with the native lowerings, to justify adding them.
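For context, here is a minimal sketch of what a 128-bit left shift has to do when only 64-bit operations are available, which is roughly the shape of work a wasm lowering faces today. It is written in Rust for illustration only; it is not the LLVM or Wasmtime output referenced above, the name `shl128_halves` is hypothetical, and masking the shift amount modulo 128 is an assumption.

```rust
/// Shift the 128-bit value `hi:lo` left by `amt`, using only 64-bit operations.
/// Assumption: the shift amount is taken modulo 128.
fn shl128_halves(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    let amt = amt & 127;
    if amt == 0 {
        // Nothing to do; also avoids an out-of-range `lo >> 64` below.
        (lo, hi)
    } else if amt < 64 {
        // Bits shifted out of the low half carry into the high half.
        (lo << amt, (hi << amt) | (lo >> (64 - amt)))
    } else {
        // The low half moves entirely into the high half.
        (0, lo << (amt - 64))
    }
}
```

The three-way case split is what costs extra instructions (compares and selects, or branches) when the target has no double-width shift instruction.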
Actually, no, I take back what I said: the assembly for aarch64 and riscv64 looks pretty similar to what the native wasm produces. Only x86_64 seems significantly smaller here through its use of … So I would update my hypothesis: aarch64/riscv64 are unlikely to show improvements, while x86_64 will likely show improvements for Wasmtime, and maybe for other engines too. This should of course be verified, however.
The numbers in #2 (comment) sort of confirm the above hypothesis. On x64, wasm is ~100% slower than native, but on aarch64 it's 35% slower than native (numbers for Wasmtime). The 35% figure is likely closer to "the general delta between Wasmtime and native" than anything specific to the shift benchmark in question. On investigating the native x64 benchmark, though, the source code for the algorithm doesn't use SIMD, yet the generated code uses vector shift/shuffle instructions. I didn't see
Upon investigating this more, I've confirmed that LLVM, for native, is lowering the core left/right shift algorithms to SIMD instructions. If I enable
Summarize #5 and write down some words for this in the overview.
I'm going to close this as "no" with the summary here
This was brought up at the last CG meeting and wasn't originally evaluated for this proposal. The question is whether 128-bit shift-and-rotate operators should be added (IIRC, please correct me if I'm wrong). This would perhaps be
i64.{shl,shr_s,shr_u,rotl,rotr}128
for example.

Performance and generated code should be evaluated for these operations today in comparison with what native platforms do. Ideally a benchmark or microbenchmark could be created to compare before/after performance of the hypothetical operations.
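To make the question concrete, the following is a rough sketch of reference semantics such operators might have, written in Rust on top of the built-in `u128`/`i128` types. Everything here is an assumption for illustration: the operand layout (low half, high half, shift amount), the two-`i64` result, the masking of the shift amount modulo 128 (mirroring how existing wasm shifts mask by bit width), and the function names, which simply echo the spelling above. None of it is taken from the proposal text.

```rust
/// Combine two 64-bit halves into one 128-bit value.
fn join(lo: u64, hi: u64) -> u128 {
    ((hi as u128) << 64) | lo as u128
}

/// Split a 128-bit value into (low, high) 64-bit halves.
fn split(v: u128) -> (u64, u64) {
    (v as u64, (v >> 64) as u64)
}

// Hypothetical semantics; the shift amount is assumed to wrap modulo 128.
fn shl128(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    split(join(lo, hi) << (amt & 127))
}

fn shr_u128(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    split(join(lo, hi) >> (amt & 127))
}

fn shr_s128(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    // Arithmetic shift: reinterpret as signed so the sign bit is replicated.
    split(((join(lo, hi) as i128) >> (amt & 127)) as u128)
}

fn rotl128(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    split(join(lo, hi).rotate_left(amt & 127))
}

fn rotr128(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    split(join(lo, hi).rotate_right(amt & 127))
}
```

A microbenchmark could call functions like these in a hot loop, once compiled natively and once compiled to wasm, to measure the gap that dedicated operators might close.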