v_shaped = torch.matmul(shapedirs, beta).view(-1, 6890, 3) + v_template RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle) #56

Open
Ngheissari opened this issue Jan 29, 2022 · 0 comments


While running the following command for training:

python -m torch.distributed.launch --nproc_per_node=8 metro/tools/run_metro_bodymesh.py

I get these tracebacks:

Traceback (most recent call last):
  File "metro/tools/run_metro_bodymesh.py", line 717, in <module>
    main(args)
  File "metro/tools/run_metro_bodymesh.py", line 711, in main
    run(args, train_dataloader, val_dataloader, _metro_network, smpl, mesh_sampler, renderer)
  File "metro/tools/run_metro_bodymesh.py", line 235, in run
    pred_camera, pred_3d_joints, pred_vertices_sub2, pred_vertices_sub, pred_vertices = METRO_model(images, smpl, mesh_sampler, meta_masks=meta_masks, is_train=True)
  File "....../miniconda3/envs/metro/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "...../MeshTransformer/metro/modeling/bert/modeling_metro.py", line 280, in forward
    features = features*meta_masks + constant_tensor*(1-meta_masks)
  File "....../miniconda3/envs/metro/lib/python3.7/site-packages/torch/tensor.py", line 394, in __rsub__
    return _C._VariableFunctions.rsub(self, other)
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.78 GiB total capacity; 1.52 GiB already allocated; 9.00 MiB free; 1.55 GiB reserved in total by PyTorch)
Traceback (most recent call last):
  File "metro/tools/run_metro_bodymesh.py", line 717, in <module>
    main(args)
  File "metro/tools/run_metro_bodymesh.py", line 711, in main
    run(args, train_dataloader, val_dataloader, _metro_network, smpl, mesh_sampler, renderer)
  File "metro/tools/run_metro_bodymesh.py", line 221, in run
    gt_vertices_sub2 = mesh_sampler.downsample(gt_vertices, n1=0, n2=2)
  File ".....MeshTransformer/metro/modeling/_smpl.py", line 272, in downsample
    y = spmm(self._D[j], y)
  File ".....MeshTransformer/metro/modeling/_smpl.py", line 172, in spmm
    return SparseMM.apply(sparse, dense)
  File "......MeshTransformer/metro/modeling/_smpl.py", line 161, in forward
    return torch.matmul(sparse, dense)
RuntimeError: CUDA error: initialization error when calling cusparseCreate(handle)
Traceback (most recent call last):
  File "metro/tools/run_metro_bodymesh.py", line 717, in <module>
    main(args)
  File "metro/tools/run_metro_bodymesh.py", line 711, in main
    run(args, train_dataloader, val_dataloader, _metro_network, smpl, mesh_sampler, renderer)
  File "metro/tools/run_metro_bodymesh.py", line 220, in run
    gt_vertices = smpl(gt_pose, gt_betas)
  File "....../miniconda3/envs/metro/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File ".....MeshTransformer/metro/modeling/_smpl.py", line 89, in forward
    v_shaped = torch.matmul(shapedirs, beta).view(-1, 6890, 3) + v_template
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)


python=3.7
pytorch==1.4.0
torchvision==0.5.0
cudatoolkit=10.1


Also, despite having 64 GB of memory and reducing the batch size to 2, I still get:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.78 GiB total capacity; 1.52 GiB already allocated; 9.00 MiB free; 1.55 GiB reserved in total by PyTorch)


torch.cuda.empty_cache() did not help.
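
For anyone reading the out-of-memory message above, the arithmetic may be a useful starting point. The sketch below parses the sizes out of that exact message and computes how much of GPU 0 is neither free nor reserved by the failing process. `summarize_oom` and the `unaccounted` key are hypothetical names introduced here, and the regular expressions assume the PyTorch 1.4 message wording shown in this report. If the remainder is large, the memory is probably held by something this process's allocator does not track (for example other processes on GPU 0, possibly other ranks from the same launch), which would also be consistent with `torch.cuda.empty_cache()` and a smaller batch size making no difference. That reading is an assumption, not a confirmed diagnosis.

```python
import re

# Error text copied verbatim from the report above.
OOM_MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB "
    "(GPU 0; 15.78 GiB total capacity; 1.52 GiB already allocated; "
    "9.00 MiB free; 1.55 GiB reserved in total by PyTorch)"
)

_UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def _to_mib(value: str, unit: str) -> float:
    """Convert a size such as ("1.55", "GiB") to MiB."""
    return float(value) * _UNIT_TO_MIB[unit]


def summarize_oom(message: str) -> dict:
    """Hypothetical helper: pull the sizes out of a PyTorch 1.4-style OOM message."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find {name!r} in message")
        sizes[name] = _to_mib(match.group(1), match.group(2))
    # Device memory that is neither free nor reserved by this process.
    sizes["unaccounted"] = sizes["total"] - sizes["reserved"] - sizes["free"]
    return sizes


summary = summarize_oom(OOM_MESSAGE)
print(f"unaccounted for: {summary['unaccounted'] / 1024:.2f} GiB")
```

With the numbers from this report, roughly 14.2 GiB of the 15.78 GiB on GPU 0 is outside the failing process's own reservation. Running `nvidia-smi` while the job is active would show which processes hold that memory. The cuBLAS and cuSPARSE initialization errors in the other tracebacks may share the same cause, since creating those handles also needs free device memory, but that link is a guess.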
