256-bit AVX intrinsics support #21684

jiepan-intel · 2024-04-03T03:49:58Z

Compiling existing x86 SSE/AVX SIMD code to WASM SIMD is very attractive: developers can reuse existing libraries without rewriting them.
However, only the 128-bit subset of the AVX intrinsics is currently supported, and much existing code does not fit within this restriction.
Adding 256-bit AVX intrinsics support would expand the applicable scenarios and may also improve performance.
Does Emscripten have a plan for this?

Currently, Google Highway supports a WASM_EMU256 target (a 2x unrolled version of wasm128),
and a re-vectorization optimization phase is being developed in Google's V8 JS engine that can pack two SIMD128 nodes into one SIMD256 node.

Sample code for AVX intrinsics support:

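// Model a 256-bit AVX register as two 128-bit WASM SIMD halves.
typedef struct Vec256 {
  __m128 v0;  // lower 128 bits (lanes 0..3)
  __m128 v1;  // upper 128 bits (lanes 4..7)
} __m256;

// Each 256-bit operation is emulated by one 128-bit operation per half.
static __inline__ __m256 __attribute__((__always_inline__, __nodebug__))
_mm256_add_ps(__m256 __a, __m256 __b) {
  __m256 c;
  c.v0 = (__m128)wasm_f32x4_add((v128_t)__a.v0, (v128_t)__b.v0);
  c.v1 = (__m128)wasm_f32x4_add((v128_t)__a.v1, (v128_t)__b.v1);
  return c;
}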
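
The same split-into-two-halves idea can be sketched in portable C. This is only an illustration: it uses GCC/Clang vector extensions in place of wasm_simd128.h so that it compiles outside Emscripten, and every name in it (f32x4, vec256f, vec256f_loadu, vec256f_storeu, vec256f_add, add8) is hypothetical and not part of Emscripten's headers.

```c
#include <string.h>

/* Hypothetical stand-in for a 128-bit SIMD register holding 4 floats
   (GCC/Clang vector extension, assumed here instead of v128_t). */
typedef float f32x4 __attribute__((vector_size(16)));

/* A 256-bit value modeled as two independent 128-bit halves. */
typedef struct {
  f32x4 lo; /* lanes 0..3 */
  f32x4 hi; /* lanes 4..7 */
} vec256f;

/* Unaligned load of 8 floats: one 128-bit load per half. */
static inline vec256f vec256f_loadu(const float *p) {
  vec256f r;
  memcpy(&r.lo, p, sizeof r.lo);
  memcpy(&r.hi, p + 4, sizeof r.hi);
  return r;
}

/* Unaligned store of 8 floats: one 128-bit store per half. */
static inline void vec256f_storeu(float *p, vec256f v) {
  memcpy(p, &v.lo, sizeof v.lo);
  memcpy(p + 4, &v.hi, sizeof v.hi);
}

/* Lane-wise add: two 128-bit adds emulate one 256-bit add. */
static inline vec256f vec256f_add(vec256f a, vec256f b) {
  vec256f r;
  r.lo = a.lo + b.lo;
  r.hi = a.hi + b.hi;
  return r;
}

/* out[i] = x[i] + y[i] for i in 0..7. */
static void add8(const float *x, const float *y, float *out) {
  vec256f_storeu(out, vec256f_add(vec256f_loadu(x), vec256f_loadu(y)));
}
```

Lane-wise operations such as add map cleanly onto the two halves. Operations that move data across the 128-bit boundary would need extra handling that this sketch does not cover.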

tlively · 2024-04-05T00:05:27Z

Given our precedent for providing emulation of SSE and Neon intrinsics, it seems reasonable to provide emulation for AVX intrinsics as well. We don't have any work planned for this, but contributions would be welcome.

jkl1337 · 2024-11-05T08:33:40Z

The current header does not work with the C compiler; a fix is proposed in #22850.

sbc100 added the SIMD label Apr 3, 2024

jiepan-intel added a commit to jiepan-intel/emscripten that referenced this issue Aug 19, 2024

Add 256-bit AVX support (emscripten-core#21684)

8edf31b

Since WebAssembly only supports 128-bit fixed vector length, one 256-bit AVX intrinsic is emulated by two 128-bit intrinsics.

jiepan-intel added a commit to jiepan-intel/emscripten that referenced this issue Aug 19, 2024

Add 256-bit AVX support (emscripten-core#21684)

1024bfd

Since WebAssembly only supports 128-bit fixed vector length, one 256-bit AVX intrinsic is emulated by two 128-bit intrinsics.

jiepan-intel added a commit to jiepan-intel/emscripten that referenced this issue Aug 22, 2024

Add 256-bit AVX support (emscripten-core#21684)

ef70f48

Since WebAssembly only supports 128-bit fixed vector length, one 256-bit AVX intrinsic is emulated by two 128-bit intrinsics.
