
256-bit AVX intrinsics support #21684

Open
jiepan-intel opened this issue Apr 3, 2024 · 2 comments
@jiepan-intel
Contributor

Compiling existing x86 SSE/AVX SIMD code to WebAssembly SIMD is very attractive: developers can reuse existing libraries without rewriting them.
However, only the 128-bit subset of the AVX intrinsics is currently supported, and much existing code does not fit within this restriction.
Adding support for the 256-bit AVX intrinsics would widen the range of code that can be ported and may also improve performance.
Does Emscripten have a plan for this?

Google Highway currently supports a WASM_EMU256 target (a 2x unrolled version of wasm128).
A re-vectorization optimization phase is also being developed in Google's V8 JavaScript engine; it can pack two SIMD128 nodes into one SIMD256 node.

Sample code for AVX intrinsics support:

typedef struct Vec256 {
  __m128 v0;
  __m128 v1;
} __m256;

static __inline__ __m256 __attribute__((__always_inline__, __nodebug__))
_mm256_add_ps(__m256 __a, __m256 __b) {
    __m256 c;
    c.v0 = (__m128)wasm_f32x4_add((v128_t)__a.v0, (v128_t)__b.v0);
    c.v1 = (__m128)wasm_f32x4_add((v128_t)__a.v1, (v128_t)__b.v1);
    return c;
}
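
The same two-halves pattern extends to loads and stores. The following is only a minimal sketch for trying the idea outside of WebAssembly: it assumes a GCC or Clang compiler and uses the `vector_size` vector extension as a stand-in for `v128_t` and the `wasm_f32x4_*` intrinsics. All `emu_`-prefixed names are hypothetical and are not part of Emscripten or of any existing header.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for one 128-bit register: 4 x float (GCC/Clang vector extension). */
typedef float emu_f32x4 __attribute__((vector_size(16)));

/* Hypothetical 256-bit type built from two 128-bit halves. */
typedef struct emu_m256 {
  emu_f32x4 v0; /* elements 0..3 */
  emu_f32x4 v1; /* elements 4..7 */
} emu_m256;

/* Unaligned load of 8 floats, modelled on _mm256_loadu_ps. */
static inline emu_m256 emu_mm256_loadu_ps(const float *p) {
  emu_m256 r;
  memcpy(&r.v0, p, sizeof r.v0);
  memcpy(&r.v1, p + 4, sizeof r.v1);
  return r;
}

/* Unaligned store of 8 floats, modelled on _mm256_storeu_ps. */
static inline void emu_mm256_storeu_ps(float *p, emu_m256 a) {
  memcpy(p, &a.v0, sizeof a.v0);
  memcpy(p + 4, &a.v1, sizeof a.v1);
}

/* One 256-bit add expressed as two independent 128-bit adds. */
static inline emu_m256 emu_mm256_add_ps(emu_m256 a, emu_m256 b) {
  emu_m256 r;
  r.v0 = a.v0 + b.v0;
  r.v1 = a.v1 + b.v1;
  return r;
}

/* Demo helper: adds two fixed 8-float arrays and returns element i of the sum. */
static float emu_demo_add(int i) {
  const float x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
  const float y[8] = {10, 20, 30, 40, 50, 60, 70, 80};
  float out[8];
  assert(i >= 0 && i < 8);
  emu_mm256_storeu_ps(out, emu_mm256_add_ps(emu_mm256_loadu_ps(x),
                                            emu_mm256_loadu_ps(y)));
  return out[i];
}
```

In a real header the element-wise `+` would presumably be replaced by the corresponding 128-bit WebAssembly intrinsic, as in the sample above; the struct layout and the split at element 4 are the parts this sketch is meant to illustrate.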
@sbc100 sbc100 added the SIMD label Apr 3, 2024
@tlively
Member

tlively commented Apr 5, 2024

Given our precedent for providing emulation of SSE and Neon intrinsics, it seems reasonable to provide emulation for AVX intrinsics as well. We don't have any work planned for this, but contributions would be welcome.

jiepan-intel added a commit to jiepan-intel/emscripten that referenced this issue Aug 19, 2024
Since WebAssembly only supports a fixed 128-bit vector length, one 256-bit AVX intrinsic is
emulated by two 128-bit intrinsics.
jiepan-intel added a commit to jiepan-intel/emscripten that referenced this issue Aug 19, 2024
Since WebAssembly only supports a fixed 128-bit vector length, one 256-bit AVX intrinsic is
emulated by two 128-bit intrinsics.
jiepan-intel added a commit to jiepan-intel/emscripten that referenced this issue Aug 22, 2024
Since WebAssembly only supports a fixed 128-bit vector length, one 256-bit AVX intrinsic is
emulated by two 128-bit intrinsics.
@jkl1337

jkl1337 commented Nov 5, 2024

The current header fails to compile with a C compiler; a fix is proposed in #22850.
