OSError: /home/xxx/.cache/keops2.1.1/build/nvrtc_jit.so: cannot open shared object file: No such file or directory #10

Open
hanlaoshi opened this issue May 8, 2024 · 2 comments

Comments

@hanlaoshi

Hi there, how should I resolve the issue below?

Traceback (most recent call last):
  File "", line 1, in
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/pykeops/numpy/test_install.py", line 20, in test_numpy_bindings
    if np.allclose(my_conv(x, y).flatten(), expected_res):
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/pykeops/numpy/generic/generic_red.py", line 303, in __call__
    self.myconv = keops_binder["nvrtc" if tagCPUGPU else "cpp"](
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/keopscore/utils/Cache.py", line 68, in __call__
    obj = self.cls(*args)
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/pykeops/common/keops_io/LoadKeOps_nvrtc.py", line 15, in __init__
    super().__init__(*args, fast_init=fast_init)
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/pykeops/common/keops_io/LoadKeOps.py", line 18, in __init__
    self.init(*args)
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/pykeops/common/keops_io/LoadKeOps.py", line 126, in init
    ) = get_keops_dll(
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/keopscore/utils/Cache.py", line 27, in __call__
    self.library[str_id] = self.fun(*args)
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/keopscore/get_keops_dll.py", line 110, in get_keops_dll_impl
    map_reduce_obj = map_reduce_class(red_formula_string, aliases, *args)
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/keopscore/mapreduce/gpu/GpuReduc1D.py", line 17, in __init__
    Gpu_link_compile.__init__(self)
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/site-packages/keopscore/binders/nvrtc/Gpu_link_compile.py", line 54, in __init__
    self.my_c_dll = CDLL(jit_compile_dll(), mode=RTLD_LAZY)
  File "/home/newdisk/ai/anaconda3/envs/tsdiff/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/newdisk/ai/.cache/keops2.1.1/build/nvrtc_jit.so: cannot open shared object file: No such file or directory

@marcelkollovieh
Contributor

Hi,
This looks like a CUDA issue. Can you delete the KeOps cache and try again?

rm -rf /home/newdisk/ai/.cache/keops*

Can you also check whether nvcc is available?

nvcc -V
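
The two checks above can also be run from Python. This is only a minimal sketch using the standard library: the helper name `diagnose_keops_cache` is made up for illustration and is not part of pykeops, and the cache location is assumed to follow the `keops*/build/nvrtc_jit.so` layout seen in the traceback.

```python
import shutil
from pathlib import Path


def diagnose_keops_cache(cache_root):
    """Report whether nvcc is on PATH and which KeOps build artifacts exist.

    Hypothetical helper for illustration; not a pykeops API.
    """
    cache_root = Path(cache_root)
    report = {"nvcc": shutil.which("nvcc"), "cache_dirs": [], "nvrtc_jit": []}
    # Assumption: cache directories are named keops<version>, as in the traceback.
    for cache_dir in sorted(cache_root.glob("keops*")):
        report["cache_dirs"].append(str(cache_dir))
        lib = cache_dir / "build" / "nvrtc_jit.so"
        if lib.is_file():
            report["nvrtc_jit"].append(str(lib))
    return report


if __name__ == "__main__":
    # Assumption: the cache lives under ~/.cache; adjust if yours is elsewhere.
    for key, value in diagnose_keops_cache(Path.home() / ".cache").items():
        print(f"{key}: {value}")
```

If `nvcc` prints as `None`, or a cache directory is listed but `nvrtc_jit` stays empty after re-running the test, the JIT library was probably never built, which would match the `OSError` above.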

@hanlaoshi
Author

Hello! I followed your suggestion: I cleared the cache and checked the output of 'nvcc -V', but the issue persisted.

(tsdiff) nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

I then uninstalled pykeops, and that resolved the problem. Does pykeops affect the results of the tsdiff model?

I am currently trying to apply tsdiff to multivariate time series. With GluonTS, training proceeds without issues, but I run into problems during evaluation. For example, with the solar_nips dataset, the following line of code fails:

forecasts = list(tqdm(forecast_it, total=len(transformed_testdata)))

The problem is a shape mismatch: during evaluation, data["future_target"] has shape torch.Size([64, 24]), while scaled has shape torch.Size([64, 1, 137]). While debugging, I found that during training data["future_target"] has shape torch.Size([64, 24, 137]), which is compatible with the shape of scaled. Could you advise on how to modify the code so that tsdiff can be adapted to multivariate series?
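
To illustrate why the evaluation shapes clash while the training shapes do not, here is a small pure-Python sketch of the standard NumPy/PyTorch broadcasting rule. It assumes the failing operation combines data["future_target"] and scaled elementwise, which I have not confirmed in the tsdiff code; `broadcast_shape` is a hypothetical helper written only for this example.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or None if they are incompatible."""
    result = []
    # Compare dimensions from the trailing axis backwards, padding with 1.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and 1 not in (da, db):
            return None  # neither equal nor 1: cannot broadcast
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


scaled = (64, 1, 137)
print(broadcast_shape((64, 24), scaled))       # evaluation shapes
print(broadcast_shape((64, 24, 137), scaled))  # training shapes
```

The evaluation pair is incompatible because the trailing dimensions 24 and 137 differ, whereas in training the length-1 axis of scaled stretches over the 24 time steps. This suggests the evaluation batch is missing the 137-series axis on future_target.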
