[Build] compilation error: invalid instruction mnemonic 'vcvtneeph2ps' #22519

saiden89 · 2024-10-21T09:55:13Z

Describe the issue

I am attempting to compile ONNX Runtime on the LUMI supercomputer, a Cray system.

The configuration step is completed without any issues. However, during the compile phase, I encountered problems when using the default CC and cc Cray compiler wrappers, which apply Cray-specific optimizations. To bypass this, I manually specified the AMD compilers (amdclang and amdclang++) instead of the wrappers.

System Details:

GPU: AMD MI250X (gfx90a)
CPU: AMD EPYC 7A53 "Trento"

Now, I’m encountering a compile-time error possibly related to the AVX512 instruction set: error: invalid instruction mnemonic 'vcvtneeph2ps', but I’m not familiar enough with all this to diagnose the issue. I would appreciate any guidance on how to address this.

Urgency

Not urgent, but would be nice to have since I have a big inference job on a project.

Target platform

AMD MI250X

Build script

The build script relies on specific modules being loaded to target the correct architecture and to load the correct programming environment. Full reproducibility might be limited because of the exotic nature of the system, but I am more than happy to try any suggestions myself.

module purge

module load PrgEnv-amd
module load rocm/6.0.3
module load craype-accel-amd-gfx90a craype-x86-trento

cd /tmp || exit
git clone --single-branch --branch main --recursive https://github.com/Microsoft/onnxruntime onnxruntime
cd onnxruntime || exit

mamba install rust -y
pip install cmake

./build.sh --config Release \
    --build_wheel \
    --update \
    --build \
    --parallel \
    --use_rocm \
    --rocm_home "$ROCM_PATH" \
    --cmake_extra_defines CMAKE_HIP_ARCHITECTURES=gfx90a \
    --cmake_extra_defines CMAKE_C_COMPILER=amdclang \
    --cmake_extra_defines CMAKE_CXX_COMPILER=amdclang++

pip install build/Linux/Release/dist/*

Error / output

log.txt

Visual Studio Version

No response

GCC / Compiler Version

AMD clang version 17.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-6.0.3 24012 af27734ed982b52a9f1be0f035ac91726fc697e4)

edgchen1 · 2024-10-21T17:27:21Z

Here's the error from the log for convenience:

Building ASM object CMakeFiles/onnxruntime_mlas.dir/tmp/onnxruntime/onnxruntime/core/mlas/lib/x86_64/cvtfp16Avx.S.o
/tmp/onnxruntime/onnxruntime/core/mlas/lib/x86_64/cvtfp16Avx.S:60:9: error: invalid instruction mnemonic 'vcvtneeph2ps'
        vcvtneeph2ps ymm0, ymmword PTR [rdi]
        ^~~~~~~~~~~~

I think this code was added in this PR:
#21183

@eralmual do you have any pointers on how to fix this?

eralmual · 2024-10-21T21:43:24Z

Hi! Thank you for reaching out!

It seems the vcvtneeph2ps instruction is not recognized by the compiler. From a quick search, the instruction has been supported in Clang since v16.0 as part of the AVX-NE-CONVERT ISA, and you are using v17.0, so it should work fine.

If it's not working for Clang in general, I can do a quick patch to prevent the compiler error while we find a solution; just let me know.
In the meantime, I think you should be able to safely delete the if statement and everything inside it at this line:

onnxruntime/cmake/onnxruntime_mlas.cmake

Line 574 in c7138a2

if(CMAKE_CXX_COMPILER_VERSION GREATER_EQUAL 13.1 AND NOT(APPLE))

and that should fix the compiler issue.

Let me know if it works!

snnn · 2024-10-22T02:38:05Z

It is more about whether your assembler (like gas) can recognize this instruction. Instead of detecting the compiler name/version, we should write a test program to check it: https://cmake.org/cmake/help/latest/module/CheckSourceCompiles.html

Contributions are welcome.
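To illustrate the idea of probing the assembler rather than checking a version number, here is a minimal shell sketch. It is not the CMake check proposed above, only a manual probe one could run before building. The helper name assembler_accepts is hypothetical, and the fallback to cc when CC is unset is an assumption; on the system in this report one would set CC=amdclang.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the compiler driver in "$1" can assemble
# the single Intel-syntax instruction in "$2", and non-zero otherwise.
assembler_accepts() {
    compiler=$1
    instruction=$2
    workdir=$(mktemp -d) || return 2
    # Minimal assembly source using the same Intel syntax as cvtfp16Avx.S.
    printf '.intel_syntax noprefix\n.text\n%s\n' "$instruction" > "$workdir/probe.s"
    # Only assemble (-c); the object file is thrown away afterwards.
    "$compiler" -c "$workdir/probe.s" -o "$workdir/probe.o" >/dev/null 2>&1
    status=$?
    rm -rf "$workdir"
    return "$status"
}

# Assumption: CC names the compiler driver that the build will use.
if assembler_accepts "${CC:-cc}" 'vcvtneeph2ps ymm0, ymmword PTR [rdi]'; then
    echo "assembler accepts vcvtneeph2ps"
else
    echo "assembler rejects vcvtneeph2ps"
fi
```

A CMake-based check would follow the same pattern: try to build a tiny source containing the instruction and enable the AVX-NE-CONVERT code path only if that succeeds.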

saiden89 · 2024-10-23T13:56:42Z

Thank you @eralmual for the suggestion; your proposed solution solves the problem. However, as the compilation continues, I am greeted by many more errors.

/pfs/lustrep2/projappl/project_465000941/compartments/onnxruntime/build/Linux/Release/amdgpu/onnxruntime/core/providers/rocm/tensor/cast_op.cc:295:1: error: explicit instantiation of 'ComputeInternal' that occurs after an explicit specialization has no effect [-Werror,-Winstantiation-after-specialization]
SPECIALIZE_IMPL(MLFloat16)

/pfs/lustrep2/projappl/project_465000941/compartments/onnxruntime/onnxruntime/core/providers/rocm/nn/conv_impl.cu:24:21: error: implicit conversion loses integer precision: 'size_t' (aka 'unsigned long') to 'int' [-Werror,-Wshorten-64-to-32]
  fast_divmod fdm_c(bias_size);
              ~~~~~ ^~~~~~~~~

Any further insights are deeply appreciated, thanks!

snnn · 2024-10-23T17:05:54Z

Please add "--compile_no_warning_as_error" to your build command.
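For reference, this is roughly what the build command from the original report looks like with that flag appended. It is a sketch, not a verified invocation: the /opt/rocm fallback is an assumption for when ROCM_PATH is not set by the rocm module, and the command is only printed so it can be reviewed first.

```shell
#!/bin/sh
# ROCM_PATH is normally set by the rocm module; /opt/rocm is an assumed fallback.
rocm_home=${ROCM_PATH:-/opt/rocm}

# Same arguments as in the original build script, plus the suggested flag.
set -- \
    --config Release \
    --build_wheel \
    --update \
    --build \
    --parallel \
    --use_rocm \
    --rocm_home "$rocm_home" \
    --cmake_extra_defines CMAKE_HIP_ARCHITECTURES=gfx90a \
    --cmake_extra_defines CMAKE_C_COMPILER=amdclang \
    --cmake_extra_defines CMAKE_CXX_COMPILER=amdclang++ \
    --compile_no_warning_as_error

# Print the command for review; run ./build.sh "$@" directly to start the build.
build_cmd="./build.sh $*"
echo "$build_cmd"
```

With this flag, diagnostics such as -Wshorten-64-to-32 are still reported but no longer stop the build.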

snnn · 2024-10-23T17:07:20Z

We don't use clang to build our CUDA code. Therefore we didn't see such warnings. You can help us fix them or suppress them if you'd like. Contributions are welcome. Thanks.

saiden89 added the "build" label (build issues; typically submitted using template) on Oct 21, 2024

github-actions bot added the "ep:ROCm" label (questions/issues related to ROCm execution provider) on Oct 21, 2024

snnn added the "contributions welcome" label (lower priority issues for the core ORT teams) and removed the "ep:ROCm" label (questions/issues related to ROCm execution provider) on Oct 22, 2024
