I tried to run cluster_demo.py on EC2. The instance starts fine but gets terminated shortly after. I get the following traceback in stdout.log:
sync initiated
log sync initiated
Running in docker
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:126] Couldn't open CUDA library libcuda.so.1. LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: 83526cf8e682
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Permission denied: could not open driver version path for reading: /proc/driver/nvidia/version
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1065] LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1066] failed to find libcuda.so on this system: Failed precondition: could not dlopen DSO: libcuda.so.1; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
using seed 1
2018-05-31 09:18:27.844271 UTC | Setting seed to 1
using seed 1
/opt/conda/envs/rllab3/lib/python3.5/site-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
"downsample module has been moved to the theano.tensor.signal.pool module.")
Traceback (most recent call last):
  File "/root/code/rllab/scripts/run_experiment_lite.py", line 137, in <module>
    run_experiment(sys.argv)
  File "/root/code/rllab/scripts/run_experiment_lite.py", line 120, in run_experiment
    method_call = cloudpickle.loads(base64.b64decode(args.args_data))
  File "/opt/conda/envs/rllab3/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 800, in _make_skel_func
    closure = _reconstruct_closure(closures) if closures else None
  File "/opt/conda/envs/rllab3/lib/python3.5/site-packages/cloudpickle/cloudpickle.py", line 792, in _reconstruct_closure
    return tuple([_make_cell(v) for v in values])
TypeError: 'int' object is not iterable
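For anyone trying to reproduce this, the failing line in the traceback base64-decodes a string and unpickles it with cloudpickle. Below is a minimal sketch of that round trip in isolation. The helper names (`encode_call`, `decode_call`, `pickler_version`) and the payload are hypothetical and not part of rllab, and the sketch falls back to the standard-library `pickle` when cloudpickle is not installed. Running it on both the local machine and inside the Docker image, then comparing the reported versions, might help rule out a version mismatch between the two sides, though that cause is only a guess and is not confirmed by the log.

```python
import base64
import functools
import pickle

try:
    import cloudpickle as pickler  # the library named in the traceback
except ImportError:
    pickler = pickle  # stdlib fallback so the sketch still runs


def encode_call(call):
    """Serialize a callable to a base64 string (hypothetical helper)."""
    return base64.b64encode(pickler.dumps(call)).decode("ascii")


def decode_call(args_data):
    """Mirror of the failing line: base64-decode, then unpickle."""
    return pickler.loads(base64.b64decode(args_data))


def pickler_version():
    """Return the pickler's module name and its version, if it exposes one."""
    return pickler.__name__, getattr(pickler, "__version__", "unknown")


# Hypothetical payload standing in for the real experiment call.
method_call = functools.partial(max, 3)
args_data = encode_call(method_call)
restored = decode_call(args_data)

print(pickler_version())
print(restored(1, 2))
```

If the versions printed on the two sides differ, one thing to try is producing and loading `args_data` with the same library version in both places.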
Any help? If additional information is necessary, I am ready to provide it.