This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

RuntimeError: CUDA error: no kernel image is available for execution on the device #965

Closed
cpoptic opened this issue Nov 26, 2019 · 1 comment

Comments

@cpoptic

cpoptic commented Nov 26, 2019


Expected results

What did you expect to see?

Actual results

What did you observe instead?

Detailed steps to reproduce

E.g.:

trainer.train()

W1126 12:36:11.825089 140613154027328 checkpoint.py:214] 'roi_heads.box_predictor.cls_score.weight' has shape (81, 1024) in the checkpoint but (22, 1024) in the model! Skipped.
W1126 12:36:11.826922 140613154027328 checkpoint.py:214] 'roi_heads.box_predictor.cls_score.bias' has shape (81,) in the checkpoint but (22,) in the model! Skipped.
W1126 12:36:11.827631 140613154027328 checkpoint.py:214] 'roi_heads.box_predictor.bbox_pred.weight' has shape (320, 1024) in the checkpoint but (84, 1024) in the model! Skipped.
W1126 12:36:11.828277 140613154027328 checkpoint.py:214] 'roi_heads.box_predictor.bbox_pred.bias' has shape (320,) in the checkpoint but (84,) in the model! Skipped.
W1126 12:36:11.829076 140613154027328 checkpoint.py:214] 'roi_heads.mask_head.predictor.weight' has shape (80, 256, 1, 1) in the checkpoint but (21, 256, 1, 1) in the model! Skipped.
W1126 12:36:11.829612 140613154027328 checkpoint.py:214] 'roi_heads.mask_head.predictor.bias' has shape (80,) in the checkpoint but (21,) in the model! Skipped.

RuntimeError Traceback (most recent call last)
in
1 trainer = DefaultTrainer(cfg)
2 trainer.resume_or_load(resume=False)
----> 3 trainer.train()

~/repos/detectron2/detectron2/engine/defaults.py in train(self)
352 OrderedDict of results, if evaluation is enabled. Otherwise None.
353 """
--> 354 super().train(self.start_iter, self.max_iter)
355 if hasattr(self, "_last_eval_results") and comm.is_main_process():
356 verify_results(self.cfg, self._last_eval_results)

~/repos/detectron2/detectron2/engine/train_loop.py in train(self, start_iter, max_iter)
130 for self.iter in range(start_iter, max_iter):
131 self.before_step()
--> 132 self.run_step()
133 self.after_step()
134 finally:

~/repos/detectron2/detectron2/engine/train_loop.py in run_step(self)
210 If your want to do something with the losses, you can wrap the model.
211 """
--> 212 loss_dict = self.model(data)
213 losses = sum(loss for loss in loss_dict.values())
214 self._detect_anomaly(losses, loss_dict)

~/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
539 result = self._slow_forward(*input, **kwargs)
540 else:
--> 541 result = self.forward(*input, **kwargs)
542 for hook in self._forward_hooks.values():
543 hook_result = hook(self, input, result)

~/repos/detectron2/detectron2/modeling/meta_arch/rcnn.py in forward(self, batched_inputs)
80
81 if self.proposal_generator:
---> 82 proposals, proposal_losses = self.proposal_generator(images, features, gt_instances)
83 else:
84 assert "proposals" in batched_inputs[0]

~/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
539 result = self._slow_forward(*input, **kwargs)
540 else:
--> 541 result = self.forward(*input, **kwargs)
542 for hook in self._forward_hooks.values():
543 hook_result = hook(self, input, result)

~/repos/detectron2/detectron2/modeling/proposal_generator/rpn.py in forward(failed resolving arguments)
177 self.post_nms_topk[self.training],
178 self.min_box_side_len,
--> 179 self.training,
180 )
181 # For RPN-only models, the proposals are the final output and we return them in

~/repos/detectron2/detectron2/modeling/proposal_generator/rpn_outputs.py in find_top_rpn_proposals(proposals, pred_objectness_logits, images, nms_thresh, pre_nms_topk, post_nms_topk, min_box_side_len, training)
134 boxes, scores_per_img, lvl = boxes[keep], scores_per_img[keep], level_ids[keep]
135
--> 136 keep = batched_nms(boxes.tensor, scores_per_img, lvl, nms_thresh)
137 # In Detectron1, there was different behavior during training vs. testing.
138 # (#459)

~/repos/detectron2/detectron2/layers/nms.py in batched_nms(boxes, scores, idxs, iou_threshold)
15 # Investigate after having a fully-cuda NMS op.
16 if len(boxes) < 40000:
---> 17 return box_ops.batched_nms(boxes, scores, idxs, iou_threshold)
18
19 result_mask = scores.new_zeros(scores.size(), dtype=torch.bool)

~/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/ops/boxes.py in batched_nms(boxes, scores, idxs, iou_threshold)
70 offsets = idxs.to(boxes) * (max_coordinate + 1)
71 boxes_for_nms = boxes + offsets[:, None]
---> 72 keep = nms(boxes_for_nms, scores, iou_threshold)
73 return keep
74

~/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/ops/boxes.py in nms(boxes, scores, iou_threshold)
31 """
32 _C = _lazy_import()
---> 33 return _C.nms(boxes, scores, iou_threshold)
34
35

RuntimeError: CUDA error: no kernel image is available for execution on the device (nms_cuda at /tmp/pip-req-build-ekueqync/torchvision/csrc/cuda/nms_cuda.cu:127)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6d (0x7fe26495ce7d in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: nms_cuda(at::Tensor const&, at::Tensor const&, float) + 0x8d1 (0x7fe230278ece in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #2: nms(at::Tensor const&, at::Tensor const&, float) + 0x183 (0x7fe23023ced7 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #3: <unknown function> + 0x79cf5 (0x7fe230256cf5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #4: <unknown function> + 0x765b0 (0x7fe2302535b0 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #5: <unknown function> + 0x70d1e (0x7fe23024dd1e in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #6: <unknown function> + 0x70fc2 (0x7fe23024dfc2 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #7: <unknown function> + 0x5be4a (0x7fe230238e4a in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/lib/python3.7/site-packages/torchvision/_C.so)
frame #8: _PyMethodDef_RawFastCallKeywords + 0x264 (0x5647d51a1c34 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #9: _PyCFunction_FastCallKeywords + 0x21 (0x5647d51a1d51 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #10: _PyEval_EvalFrameDefault + 0x4ebc (0x5647d520e0ac in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #11: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #12: _PyEval_EvalFrameDefault + 0x416 (0x5647d5209606 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #13: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #14: _PyEval_EvalFrameDefault + 0x4b29 (0x5647d520dd19 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #15: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #16: _PyEval_EvalFrameDefault + 0x416 (0x5647d5209606 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #17: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #18: _PyEval_EvalFrameDefault + 0x416 (0x5647d5209606 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #19: _PyEval_EvalCodeWithName + 0xab8 (0x5647d5151978 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #20: _PyFunction_FastCallDict + 0x1d5 (0x5647d51522a5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #21: _PyObject_Call_Prepend + 0x63 (0x5647d5170e33 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #22: PyObject_Call + 0x6e (0x5647d5163a3e in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #23: _PyEval_EvalFrameDefault + 0x1f3a (0x5647d520b12a in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #24: _PyEval_EvalCodeWithName + 0x2f9 (0x5647d51511b9 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #25: _PyFunction_FastCallDict + 0x1d5 (0x5647d51522a5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #26: _PyObject_Call_Prepend + 0x63 (0x5647d5170e33 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #27: <unknown function> + 0x16a2da (0x5647d51a82da in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #28: _PyObject_FastCallKeywords + 0x49b (0x5647d51a919b in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #29: _PyEval_EvalFrameDefault + 0x52e6 (0x5647d520e4d6 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #30: _PyEval_EvalCodeWithName + 0xab8 (0x5647d5151978 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #31: _PyFunction_FastCallDict + 0x1d5 (0x5647d51522a5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #32: _PyObject_Call_Prepend + 0x63 (0x5647d5170e33 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #33: PyObject_Call + 0x6e (0x5647d5163a3e in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #34: _PyEval_EvalFrameDefault + 0x1f3a (0x5647d520b12a in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #35: _PyEval_EvalCodeWithName + 0x2f9 (0x5647d51511b9 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #36: _PyFunction_FastCallDict + 0x1d5 (0x5647d51522a5 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #37: _PyObject_Call_Prepend + 0x63 (0x5647d5170e33 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #38: <unknown function> + 0x16a2da (0x5647d51a82da in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #39: _PyObject_FastCallKeywords + 0x49b (0x5647d51a919b in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #40: _PyEval_EvalFrameDefault + 0x52e6 (0x5647d520e4d6 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #41: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #42: _PyEval_EvalFrameDefault + 0x6a3 (0x5647d5209893 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #43: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #44: _PyEval_EvalFrameDefault + 0x4b29 (0x5647d520dd19 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #45: _PyEval_EvalCodeWithName + 0x5da (0x5647d515149a in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #46: _PyFunction_FastCallKeywords + 0x387 (0x5647d51a1437 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #47: _PyEval_EvalFrameDefault + 0x6a3 (0x5647d5209893 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #48: _PyEval_EvalCodeWithName + 0x2f9 (0x5647d51511b9 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #49: PyEval_EvalCodeEx + 0x44 (0x5647d5152094 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #50: PyEval_EvalCode + 0x1c (0x5647d51520bc in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #51: <unknown function> + 0x1daeb0 (0x5647d5218eb0 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #52: _PyMethodDef_RawFastCallKeywords + 0xe9 (0x5647d51a1ab9 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #53: _PyCFunction_FastCallKeywords + 0x21 (0x5647d51a1d51 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #54: _PyEval_EvalFrameDefault + 0x4784 (0x5647d520d974 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #55: _PyGen_Send + 0x2a2 (0x5647d51a9e32 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #56: _PyEval_EvalFrameDefault + 0x1a88 (0x5647d520ac78 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #57: _PyGen_Send + 0x2a2 (0x5647d51a9e32 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #58: _PyEval_EvalFrameDefault + 0x1a88 (0x5647d520ac78 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #59: _PyGen_Send + 0x2a2 (0x5647d51a9e32 in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #60: _PyMethodDef_RawFastCallKeywords + 0x8d (0x5647d51a1a5d in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #61: _PyMethodDescr_FastCallKeywords + 0x4f (0x5647d51a8c6f in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #62: _PyEval_EvalFrameDefault + 0x4c7b (0x5647d520de6b in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)
frame #63: _PyFunction_FastCallKeywords + 0xfb (0x5647d51a11ab in /home/paperspace/anaconda3/envs/detectron2_cuda_9_2/bin/python)

System information

  • Operating system: ?
  • Compiler version: ?
  • CUDA version: nvidia-smi shows 10.1, but conda list | grep cuda shows cudatoolkit=9.2
  • cuDNN version: 7.6.4
  • NVIDIA driver version: 430.50
  • GPU models (for all devices if they are not all the same): V100
  • PYTHONPATH environment variable: ?
  • python --version output: 3.7.5
  • Anything else that seems relevant: ?

Running in a Conda environment with Detectron2 installed.
I downgraded from CUDA 10.1 to CUDA 9.2 to fix an earlier bug involving:

no kernel image is available for execution on the device

!nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50 Driver Version: 430.50 CUDA Version: 10.1

gcc --version
gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
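
The outputs above report three different CUDA versions. nvidia-smi shows the highest CUDA version the driver supports (10.1), nvcc shows the toolkit on the PATH (10.0), and conda shows the cudatoolkit installed in the environment (9.2). The traceback path /tmp/pip-req-build-ekueqync/torchvision suggests torchvision was built from source by pip, so a mismatch between the compiler toolkit and the toolkit in the environment may be relevant, although this thread does not confirm it. A minimal sketch of the comparison, using only the values pasted in this report (the helper names are hypothetical, not part of any library):

```python
import re


def nvcc_release(nvcc_output):
    """Return the (major, minor) CUDA release found in `nvcc -V` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_version(text):
    """Turn a version string such as '9.2' into the tuple (9, 2)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


# Values copied from the report above; nothing here queries the real system.
NVCC_OUTPUT = "Cuda compilation tools, release 10.0, V10.0.130"
CONDA_CUDATOOLKIT = "9.2"

compiler_cuda = nvcc_release(NVCC_OUTPUT)
environment_cuda = parse_version(CONDA_CUDATOOLKIT)
versions_match = compiler_cuda == environment_cuda

print("nvcc toolkit:", compiler_cuda)
print("conda cudatoolkit:", environment_cuda)
print("match:", versions_match)
```

With the values from this report the two versions differ, so the check reports no match.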

@ppwwyyxx
Contributor

Detectron and detectron2 are two different projects.
Your error is described in https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues. If this does not solve the problem, please include details about the problem following detectron2's issue template.
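
For context, "no kernel image is available for execution on the device" generally means the compiled extension contains no code that can run on the GPU's architecture. The V100 in this report has compute capability 7.0. The sketch below illustrates the CUDA compatibility rule: a binary built for sm_XY runs on devices with the same major version and an equal or higher minor version, while embedded PTX for compute_XY can be JIT-compiled on any device of equal or higher capability. The function names and the architecture lists are hypothetical and chosen only for illustration; they are not read from the reporter's build.

```python
def parse_arch(entry):
    """Split an entry such as 'sm_70' into ('sm', (7, 0))."""
    kind, _, digits = entry.partition("_")
    return kind, (int(digits[:-1]), int(digits[-1]))


def has_kernel_image(arch_list, capability):
    """Return True if any entry in arch_list can run on a device of this capability."""
    major, minor = capability
    for entry in arch_list:
        kind, (arch_major, arch_minor) = parse_arch(entry)
        # Binary code (sm_XY): same major version, device minor >= built minor.
        if kind == "sm" and arch_major == major and arch_minor <= minor:
            return True
        # PTX (compute_XY): forward compatible with any newer device.
        if kind == "compute" and (arch_major, arch_minor) <= (major, minor):
            return True
    return False


V100 = (7, 0)  # compute capability of the GPU named in the report

# Hypothetical build targets, for illustration only.
print(has_kernel_image(["sm_35", "sm_50", "sm_60"], V100))
print(has_kernel_image(["sm_60", "sm_70"], V100))
print(has_kernel_image(["compute_60"], V100))
```

A build that targets only older binary architectures and embeds no PTX fails this check for a V100, which is the situation the error message describes.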
