There is a nvcc error when I install apex #50

Open
JuntingLee opened this issue Jan 3, 2022 · 2 comments


@JuntingLee

JuntingLee commented Jan 3, 2022

I ran the command:
python setup.py install --cuda_ext --cpp_ext
and got



torch.__version__  = 1.4.0


setup.py:107: UserWarning: Option --pyprof not specified. Not installing PyProf dependencies!
  warnings.warn("Option --pyprof not specified. Not installing PyProf dependencies!")

Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
from /usr/local/cuda-10.1/bin

running install
running bdist_egg
running egg_info
writing apex.egg-info/PKG-INFO
writing dependency_links to apex.egg-info/dependency_links.txt
writing top-level names to apex.egg-info/top_level.txt
reading manifest file 'apex.egg-info/SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'apex.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'fused_layer_norm_cuda' extension
gcc -pthread -B /home/bob/anaconda3/envs/metro/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/bob/anaconda3/envs/metro/lib/python3.7/site-packages/torch/include -I/home/bob/anaconda3/envs/metro/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/bob/anaconda3/envs/metro/lib/python3.7/site-packages/torch/include/TH -I/home/bob/anaconda3/envs/metro/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/bob/anaconda3/envs/metro/include/python3.7m -c csrc/layer_norm_cuda.cpp -o build/temp.linux-x86_64-3.7/csrc/layer_norm_cuda.o -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=fused_layer_norm_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda-10.1/bin/nvcc -I/home/bob/anaconda3/envs/metro/lib/python3.7/site-packages/torch/include -I/home/bob/anaconda3/envs/metro/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/bob/anaconda3/envs/metro/lib/python3.7/site-packages/torch/include/TH -I/home/bob/anaconda3/envs/metro/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda-10.1/include -I/home/bob/anaconda3/envs/metro/include/python3.7m -c csrc/layer_norm_cuda_kernel.cu -o build/temp.linux-x86_64-3.7/csrc/layer_norm_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -maxrregcount=50 -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=fused_layer_norm_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_61,code=sm_61 -std=c++11
csrc/layer_norm_cuda_kernel.cu:4:10: fatal error: ATen/cuda/DeviceUtils.cuh: No such file or directory
 #include "ATen/cuda/DeviceUtils.cuh"
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/usr/local/cuda-10.1/bin/nvcc' failed with exit status 1

I have installed CUDA 10.1. I ran the command:
which nvcc
and I got
/usr/local/cuda-10.1/bin/nvcc
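The fatal error in the log is a missing header rather than a missing compiler: nvcc runs, but `ATen/cuda/DeviceUtils.cuh` is not found in any `-I` directory. A minimal sketch for checking this before building, assuming the include directories are the ones shown in the nvcc command above (the helper name `missing_headers` is made up for this illustration):

```python
from pathlib import Path


def missing_headers(include_dirs, headers):
    """Return the headers that do not exist under any of the include dirs."""
    dirs = [Path(d) for d in include_dirs]
    return [h for h in headers if not any((d / h).is_file() for d in dirs)]


# Example use (paths copied from the build log; adjust to your environment):
# dirs = ["/home/bob/anaconda3/envs/metro/lib/python3.7/site-packages/torch/include"]
# print(missing_headers(dirs, ["ATen/cuda/DeviceUtils.cuh"]))
#
# If torch is importable, torch.utils.cpp_extension.include_paths() should
# return the same torch include directories without hard-coding them.
```

If the header is reported as missing, the installed PyTorch does not ship it, and the apex source being compiled presumably targets a different PyTorch version.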

@kevinlin311tw
Member

I am not sure whether NVIDIA Apex has had any recent updates.

One suggestion is to skip the Apex installation. For some reason, we observed that mixed-precision training is somewhat slow. We suspect there are issues when running PyTorch 1.4 with Apex.
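If Apex is skipped, code that imports it unconditionally will fail, so one common pattern is to treat it as an optional dependency. This is only a sketch of that pattern, not this project's actual code; `has_module` and `precision_mode` are hypothetical names:

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


# Checked once at startup; False when apex was never installed.
USE_APEX = has_module("apex")


def precision_mode(requested_fp16):
    """Pick mixed precision only if it was requested and apex is present."""
    return "apex-amp" if (requested_fp16 and USE_APEX) else "fp32"
```

With this guard, a run without Apex falls back to full-precision training instead of raising an ImportError.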

@amogh112

Hey, you need to check out an earlier version of apex before installing, because of changes in PyTorch, as described here:
https://issueexplorer.com/issue/NVIDIA/apex/1200
git checkout f3a960f80244cf9e80558ab30f7f7e8cbf03c0a0
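Put together with the install command from the original post, the sequence could be scripted roughly as follows. This is a sketch that assumes it is run against an existing clone of the apex repository; the function names are made up for illustration, and only the commit hash and the install flags come from this thread:

```python
import subprocess
import sys

# Commit suggested above as compatible with older PyTorch releases.
APEX_COMMIT = "f3a960f80244cf9e80558ab30f7f7e8cbf03c0a0"


def pinned_install_commands(commit=APEX_COMMIT, python=sys.executable):
    """Build the argv lists: check out the commit, then install apex."""
    return [
        ["git", "checkout", commit],
        [python, "setup.py", "install", "--cuda_ext", "--cpp_ext"],
    ]


def run_all(commands, cwd):
    """Run each command inside the apex checkout; stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)


# Example use (the path to the clone is a placeholder):
# run_all(pinned_install_commands(), cwd="path/to/apex")
```

Using `sys.executable` keeps the build in the same conda environment that the script was started from.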
