This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

Error while running infer_simple.py for the first time #812

Open
Gigibulid opened this issue Jan 30, 2019 · 3 comments
@Gigibulid

I have found several suggested solutions for this problem, but none of them works for me.

Expected results

A pdf file with the visualizations of the detections

Actual results

Traceback (most recent call last):
File "tools/infer_simple.py", line 185, in <module>
main(args)
File "tools/infer_simple.py", line 153, in main
model, im, None, timers=timers
File "/home/gigi/detectron/detectron/core/test.py", line 66, in im_detect_all
model, im, cfg.TEST.SCALE, cfg.TEST.MAX_SIZE, boxes=box_proposals
File "/home/gigi/detectron/detectron/core/test.py", line 158, in im_detect_bbox
workspace.RunNet(model.net.Proto().name)
File "/home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/workspace.py", line 236, in RunNet
StringifyNetName(name), num_iter, allow_fail,
File "/home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/workspace.py", line 197, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at context_gpu.cu:415] error == cudaSuccess. 2 vs 0. Error at: /opt/conda/conda-bld/pytorch_1544194558701/work/caffe2/core/context_gpu.cu:415: out of memory
Error from operator:
input: "gpu_0/res3_0_branch2a" input: "gpu_0/res3_0_branch2b_w" output: "gpu_0/res3_0_branch2b" name: "" type: "Conv" arg { name: "kernel" i: 3 } arg { name: "exhaustive_search" i: 0 } arg { name: "stride" i: 1 } arg { name: "pad" i: 1 } arg { name: "order" s: "NCHW" } arg { name: "dilation" i: 1 } device_option { device_type: 1 device_id: 0 } engine: "CUDNN"
frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::string const&, void const*) + 0x59 (0x7fb7bdacb309 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libc10.so)
frame #1: <unknown function> + 0x2a6945c (0x7fb7c074545c in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #2: <unknown function> + 0x13c7fd5 (0x7fb7bf0a3fd5 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #3: <unknown function> + 0x157a294 (0x7fb7bf256294 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #4: bool caffe2::CudnnConvOp::DoRunWithType<float, float, float, float>() + 0x3d9 (0x7fb7bf2648a9 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #5: caffe2::CudnnConvOp::RunOnDevice() + 0x1b0 (0x7fb7bf24e060 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #6: <unknown function> + 0x14d0955 (0x7fb7bf1ac955 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #7: caffe2::AsyncNetBase::run(int, int) + 0x144 (0x7fb7fcd3d324 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #8: <unknown function> + 0x118a6c2 (0x7fb7fcd446c2 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #9: c10::ThreadPool::main_loop(unsigned long) + 0x258 (0x7fb7fc0847e8 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #10: <unknown function> + 0xb8678 (0x7fb810285678 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/../../libstdc++.so.6)
frame #11: <unknown function> + 0x8184 (0x7fb8189ca184 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #12: clone + 0x6d (0x7fb817fea03d in /lib/x86_64-linux-gnu/libc.so.6)

Detailed steps to reproduce

I ran the following command from: https://github.com/facebookresearch/Detectron/blob/master/GETTING_STARTED.md

python tools/infer_simple.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --output-dir /tmp/detectron-visualizations --image-ext jpg --wts https://dl.fbaipublicfiles.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl demo

System information

  • Operating system: Ubuntu 14.04
  • gcc --version : gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
  • CUDA version: 10.0
  • cuDNN version: 7.4.2
  • NVIDIA driver version: 410.48
  • GPU model: GeForce GTX 1050
  • PYTHONPATH environment variable: echo $PYTHONPATH outputs nothing
  • python --version output: Python 2.7.15 :: Anaconda custom (64-bit)
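
Since the traceback ends in an out of memory error at context_gpu.cu:415, the amount of memory actually free on the GPU is a useful extra data point alongside the GPU model. The following is a minimal sketch for collecting it, assuming `nvidia-smi` is on the PATH. The helper names `parse_gpu_memory` and `query_gpu_memory` are hypothetical and not part of Detectron or Caffe2.

```python
import subprocess

# nvidia-smi query that prints one CSV row per GPU: name, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Turn nvidia-smi CSV rows into (name, used_mib, total_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Split from the right so a comma inside the GPU name cannot break parsing.
        name, used, total = [field.strip() for field in line.rsplit(",", 2)]
        rows.append((name, int(used), int(total)))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return parsed rows, or [] if the tool is unavailable."""
    try:
        out = subprocess.check_output(QUERY)
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gpu_memory(out.decode("utf-8"))
```

Calling `query_gpu_memory()` just before running infer_simple.py shows how much memory other processes (a desktop session, for example) already hold on the card.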

Other info

When I run: python /tests/test_spatial_narrow_as_op.py
I get:
[E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
Found Detectron ops lib: /home/afroditi/anaconda2/envs/caffe227/lib/python2.7/site-packages/torch/lib/libcaffe2_detectron_ops_gpu.so
...

Ran 3 tests in 2.810s

OK

@jungaria

I have the same problem, so I cannot go any further with Detectron.
Any comments would be appreciated.

Thanks

@jungaria

@Gigibulid

Hey, try changing the values as described below.

When I ran infer_simple.py with the options --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --output-dir ./test/imageDetection --image-ext jpg --wts trainedWeights/model_final.pkl demo, I got an "out of memory" error.

I changed the SCALES/SCALE values in e2e_mask_rcnn_R-101-FPN_2x.yaml from 800 to 300 (700 still produced the same error), and then it worked.
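
For anyone who would rather not edit the shipped config in place, the change can be sketched as a small script that writes a modified copy. This is only a sketch under assumptions: it assumes the config spells the keys as `SCALES: (800,)` and `SCALE: 800` on their own lines, and the helper names `lower_scales` and `write_low_memory_config` are hypothetical, not part of Detectron.

```python
import io
import re


def lower_scales(yaml_text, new_scale):
    """Return yaml_text with every SCALE / SCALES entry set to new_scale."""
    # `SCALE: 800` -> `SCALE: 300` (the colon keeps this from matching SCALES).
    text = re.sub(
        r"^([ \t]*SCALE:[ \t]*)\d+",
        r"\g<1>" + str(new_scale),
        yaml_text,
        flags=re.M,
    )
    # `SCALES: (800,)` -> `SCALES: (300,)`
    text = re.sub(
        r"^([ \t]*SCALES:[ \t]*)\(\s*\d+\s*,?\s*\)",
        r"\g<1>(" + str(new_scale) + ",)",
        text,
        flags=re.M,
    )
    return text


def write_low_memory_config(src_path, dst_path, new_scale=300):
    """Copy the config at src_path to dst_path with lowered scales."""
    with io.open(src_path, encoding="utf-8") as f:
        text = f.read()
    with io.open(dst_path, "w", encoding="utf-8") as f:
        f.write(lower_scales(text, new_scale))
```

The copy written by `write_low_memory_config` can then be passed to infer_simple.py through `--cfg` in place of the original file. Because the scale sets the length of the shorter image side, going from 800 to 300 reduces the pixel count to roughly (300/800)^2, or about 14%, which is presumably why the network then fits in GPU memory. Detection quality on small objects may suffer at the lower resolution.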

@pratikbhave2

Hi @Gigibulid, I was facing the same error and @jungaria's solution worked.
Thanks, @jungaria!
