Error while running infer_simple.py for the first time #812

Gigibulid · 2019-01-30T15:44:22Z

Even though I have found several suggested solutions for this problem, nothing seems to work for me...

Expected results

A pdf file with the visualizations of the detections

Actual results

Traceback (most recent call last):
File "tools/infer_simple.py", line 185, in <module>
main(args)
File "tools/infer_simple.py", line 153, in main
model, im, None, timers=timers
File "/home/gigi/detectron/detectron/core/test.py", line 66, in im_detect_all
model, im, cfg.TEST.SCALE, cfg.TEST.MAX_SIZE, boxes=box_proposals
File "/home/gigi/detectron/detectron/core/test.py", line 158, in im_detect_bbox
workspace.RunNet(model.net.Proto().name)
File "/home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/workspace.py", line 236, in RunNet
StringifyNetName(name), num_iter, allow_fail,
File "/home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/workspace.py", line 197, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at context_gpu.cu:415] error == cudaSuccess. 2 vs 0. Error at: /opt/conda/conda-bld/pytorch_1544194558701/work/caffe2/core/context_gpu.cu:415: out of memory
Error from operator:
input: "gpu_0/res3_0_branch2a" input: "gpu_0/res3_0_branch2b_w" output: "gpu_0/res3_0_branch2b" name: "" type: "Conv" arg { name: "kernel" i: 3 } arg { name: "exhaustive_search" i: 0 } arg { name: "stride" i: 1 } arg { name: "pad" i: 1 } arg { name: "order" s: "NCHW" } arg { name: "dilation" i: 1 } device_option { device_type: 1 device_id: 0 } engine: "CUDNN"
frame #0: c10::ThrowEnforceNotMet(char const, int, char const, std::string const&, void const) + 0x59 (0x7fb7bdacb309 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libc10.so)
frame #1: + 0x2a6945c (0x7fb7c074545c in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #2: + 0x13c7fd5 (0x7fb7bf0a3fd5 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #3: + 0x157a294 (0x7fb7bf256294 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #4: bool caffe2::CudnnConvOp::DoRunWithType<float, float, float, float>() + 0x3d9 (0x7fb7bf2648a9 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #5: caffe2::CudnnConvOp::RunOnDevice() + 0x1b0 (0x7fb7bf24e060 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #6: + 0x14d0955 (0x7fb7bf1ac955 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #7: caffe2::AsyncNetBase::run(int, int) + 0x144 (0x7fb7fcd3d324 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #8: + 0x118a6c2 (0x7fb7fcd446c2 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #9: c10::ThreadPool::main_loop(unsigned long) + 0x258 (0x7fb7fc0847e8 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #10: + 0xb8678 (0x7fb810285678 in /home/gigi/anaconda2/envs/caffe227/lib/python2.7/site-packages/../../libstdc++.so.6)
frame #11: + 0x8184 (0x7fb8189ca184 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #12: clone + 0x6d (0x7fb817fea03d in /lib/x86_64-linux-gnu/libc.so.6)

Detailed steps to reproduce

I ran the following command from: https://github.com/facebookresearch/Detectron/blob/master/GETTING_STARTED.md

python tools/infer_simple.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --output-dir /tmp/detectron-visualizations --image-ext jpg --wts https://dl.fbaipublicfiles.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl demo

System information

Operating system: Ubuntu 14.04
gcc --version : gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
CUDA version: 10.0
cuDNN version: 7.4.2
NVIDIA driver version: 410.48
GPU model: GeForce GTX 1050
PYTHONPATH environment variable: echo $PYTHONPATH outputs nothing
python --version output: Python 2.7.15 :: Anaconda custom (64-bit)

Other info

When I run: python /tests/test_spatial_narrow_as_op.py
I get:
[E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
Found Detectron ops lib: /home/afroditi/anaconda2/envs/caffe227/lib/python2.7/site-packages/torch/lib/libcaffe2_detectron_ops_gpu.so
...

Ran 3 tests in 2.810s

OK


jungaria · 2019-03-29T01:45:40Z

I have the same problem, so I cannot go any further with Detectron.
Any comments would be appreciated.

Thanks

jungaria · 2019-03-29T02:26:59Z

@Gigibulid

Hey, try changing the values as described below.

When I ran infer_simple.py with the options --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --output-dir ./test/imageDetection --image-ext jpg --wts trainedWeights/model_final.pkl demo, I got an "out of memory" error.

I changed the SCALES/SCALE values in e2e_mask_rcnn_R-101-FPN_2x.yaml from 800 to 300 (700 still produced the same error), and then it worked.
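
A rough way to see why a smaller scale helps: as far as I understand, Detectron resizes each image so that its shorter side matches the scale value, capping the longer side at MAX_SIZE, and the memory needed for activations grows roughly with the number of pixels. The sketch below is not Detectron code. The function names are hypothetical, the MAX_SIZE of 1333 is assumed from the stock baseline config, and the 480x640 image is an arbitrary example.

```python
def resized_shape(height, width, target_scale, max_size):
    """Return (new_height, new_width) after scaling the shorter side to
    target_scale while keeping the longer side at or below max_size."""
    short_side = min(height, width)
    long_side = max(height, width)
    scale = float(target_scale) / short_side
    # If the longer side would exceed the cap, scale by the cap instead.
    if round(scale * long_side) > max_size:
        scale = float(max_size) / long_side
    return int(round(height * scale)), int(round(width * scale))


def pixel_ratio(height, width, old_scale, new_scale, max_size=1333):
    """Ratio of resized pixel counts (new / old) for one image.
    max_size=1333 is an assumption taken from the baseline config."""
    old_h, old_w = resized_shape(height, width, old_scale, max_size)
    new_h, new_w = resized_shape(height, width, new_scale, max_size)
    return (new_h * new_w) / float(old_h * old_w)


# Hypothetical 480x640 input image.
print(resized_shape(480, 640, 800, 1333))  # (800, 1067)
print(resized_shape(480, 640, 300, 1333))  # (300, 400)
print(pixel_ratio(480, 640, 800, 300))
```

For this example image, going from 800 to 300 leaves about 14% of the pixels, which is consistent with the error disappearing on a small GPU. Expect lower detection quality at the smaller scale, since the baseline weights were trained at 800.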

pratikbhave2 · 2019-09-12T19:08:09Z

Hi @Gigibulid, I was facing the same error and @jungaria's solution worked.
Thanks, @jungaria!
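
For anyone else landing here with "out of memory" from context_gpu.cu, it may help to check how much GPU memory is already in use before running inference. This is a minimal sketch, assuming nvidia-smi is on the PATH; the helper names are hypothetical and the sample values in the comment are made up for illustration.

```python
import subprocess

# nvidia-smi query for used and total memory in MiB, one CSV row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse rows such as '350, 4096' into a list of (used_mib, total_mib)."""
    rows = []
    for line in text.strip().splitlines():
        used, total = [int(field) for field in line.split(",")]
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return (used_mib, total_mib) for each GPU."""
    output = subprocess.check_output(QUERY).decode("utf-8")
    return parse_memory_csv(output)


# Usage on a machine with an NVIDIA driver installed:
#   for index, (used, total) in enumerate(query_gpu_memory()):
#       print("GPU %d: %d / %d MiB used" % (index, used, total))
```

If most of the memory is already taken by the desktop session or another process, closing it may be enough; otherwise reducing the scale as described above is the simpler fix.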
