New rcnn code: GPU memory keeps increasing and GPU utilization is very low. What is the reason? #11980

hdjsjyl · 2018-08-01T22:15:54Z

Thanks for your great work.
Compared with the old rcnn version, the new code occupies more GPU memory, and the usage keeps increasing. This seems very strange to me. Do you know the reason?

ijkguo · 2018-08-02T18:48:30Z

Are you comparing memory footprint with the same MXNet build?
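
For a like-for-like comparison, it can help to log GPU memory and utilization at fixed intervals while each version runs on the same build. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and reports numeric values for both fields. The helper names `parse_gpu_stats`, `sample_gpu_stats`, and `log_gpu_stats` are hypothetical and are not part of MXNet or the rcnn example.

```python
import subprocess
import time

# Query used memory (MiB) and GPU utilization (%) as bare CSV values.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(text):
    """Parse CSV lines such as '1234, 56' into (memory_mib, utilization_pct) tuples."""
    stats = []
    for line in text.strip().splitlines():
        mem, util = (field.strip() for field in line.split(","))
        # Assumption: both fields are plain integers (no '[N/A]' entries).
        stats.append((int(mem), int(util)))
    return stats


def sample_gpu_stats():
    """Run nvidia-smi once and return one tuple per visible GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout)


def log_gpu_stats(samples=10, interval_s=5.0):
    """Print one line per sample so two runs can be compared side by side."""
    for i in range(samples):
        print(i, sample_gpu_stats())
        time.sleep(interval_s)
```

Running `log_gpu_stats()` in a separate process while each version trains gives two traces that can be compared directly.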

hdjsjyl · 2018-08-02T18:50:26Z

Author

@ijkguo Yes, I use the same settings; only the rcnn code differs.

ijkguo · 2018-08-02T19:24:22Z

Higher initial memory use: the maximum reserved shape changed from (1, 3, 600, 1000) to (1, 3, 1000, 1000).

Memory keeps increasing: MutableModule was replaced by Module, which supports varying input shapes. Module allocates more and more memory until it cannot allocate any more, but this is safe. MutableModule shared the initially reserved memory, so its usage did not grow.

Lower GPU utilization: all Cython speedups were removed.

Changing to the new version makes the example much easier to set up, use, and maintain, at the price of slower speed. However, research implementations based on the original example are available elsewhere, for example https://github.com/msracver/Deformable-ConvNets. For historical reference, the old version is available at https://github.com/ijkguo/mx-rcnn/tree/v5.1.
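
The two memory points in this reply can be illustrated with a small pure-Python sketch. It is a toy model, not a description of MXNet internals. It assumes float32 data, counts only the input tensor (feature maps in a convolutional backbone would scale roughly with input area), and assumes that growth comes from keeping a separate buffer for every new input shape. The names `tensor_bytes`, `reserved_upfront`, and `grow_on_demand` are hypothetical.

```python
from math import prod

BYTES_PER_FLOAT32 = 4


def tensor_bytes(shape, itemsize=BYTES_PER_FLOAT32):
    """Bytes needed for one dense tensor of the given shape."""
    return prod(shape) * itemsize


# Initial footprint: the reserved maximum shape grew in area by a factor of 1000/600.
old_max = (1, 3, 600, 1000)
new_max = (1, 3, 1000, 1000)
ratio = tensor_bytes(new_max) / tensor_bytes(old_max)


def reserved_upfront(shapes, max_shape):
    """Toy model of a shared pool sized once for max_shape: usage stays flat."""
    size = tensor_bytes(max_shape)
    return [size for _ in shapes]


def grow_on_demand(shapes):
    """Toy model where every previously unseen shape adds a buffer that is kept."""
    seen = set()
    total = 0
    history = []
    for shape in shapes:
        if shape not in seen:
            seen.add(shape)
            total += tensor_bytes(shape)
        history.append(total)
    return history


# Hypothetical sequence of image shapes arriving during training.
shapes = [(1, 3, 600, 800), (1, 3, 600, 1000), (1, 3, 600, 800), (1, 3, 1000, 1000)]
flat = reserved_upfront(shapes, new_max)
growing = grow_on_demand(shapes)
print(f"input buffer ratio new/old: {ratio:.2f}")
print("reserved up front:", flat)
print("grow on demand:   ", growing)
```

In this model the up-front strategy pays its full cost at the start and then stays constant, while the on-demand strategy climbs each time a new shape appears and levels off only once every shape has been seen.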

Roshrini · 2018-08-02T21:06:02Z

Collaborator

@sandeep-krishnamurthy Can you please add labels: Memory, Question

hdjsjyl · 2018-08-02T21:29:16Z

Author

Thanks for the thorough explanation. I am using Faster-RCNN for face detection, and I need to add some tricks and change the network structure. If the basic Faster-RCNN already occupies a lot of memory, that is a problem for me, so I think the old version suits me better, even though the new Faster-RCNN code is easier to read and modify. Thanks for your excellent work.

-- Best wishes! Lei Shi

hdjsjyl · 2018-08-02T21:40:41Z

Author

@ijkguo I also found another problem with the old rcnn code. Sometimes, when I resume training, the program hangs: no GPU utilization, no output, and the occupied GPU memory stays unchanged. All I can do is stop and restart it. I have checked the code multiple times and modified parts of it, but the problem persists, which is strange to me. That is why I have been looking forward to the new rcnn code. Thanks.

ijkguo · 2018-08-02T22:12:19Z

This may not be a problem with rcnn alone. In my experience, program hangs are usually related to the NVIDIA driver.

hdjsjyl · 2018-08-02T22:14:21Z

Author

That is a good point. Thank you very much.
