I meet problem during implement light_head_rcnn #45

hewumars · 2018-05-06T08:33:37Z

loss_bbox does not converge. The other losses (loss_cls, loss_rpn_cls, loss_bbox) do converge. Can I push the code to you for debugging?

Rizhiy · 2018-05-07T12:24:35Z

Hi, where did you take PSRoIPool layer from? A lot of PyTorch implementations of that layer are bugged. Also, they are probably implemented for single image batch and might not work with multiple images per batch.

hewumars · 2018-05-07T13:29:57Z

PSRoI_Align from https://github.com/zengarden/light_head_rcnn
PSRoIPooling from https://github.com/PureDiors/pytorch_RFCN
I set batchsize=1 when training light_head_rcnn. The code seems to be able to work with multiple images per batch, but at the least a single image per batch works.

Rizhiy · 2018-05-07T14:58:36Z

I'm pretty sure PSRoIPooling in that repo is bugged, see: xiong-zhitong/pytorch_RFCN#4.

hewumars · 2018-05-07T15:14:53Z

The light head rcnn model also does not converge when using PSRoI_Align from https://github.com/zengarden/light_head_rcnn. I opened a pull request: #48

hewumars · 2018-05-07T15:17:40Z

I will carefully check the code

hewumars · 2018-05-08T01:57:42Z

@Rizhiy could you share your PSRoIPooling? I compared the code with https://github.com/msracver/Deformable-ConvNets/blob/master/rfcn/operator_cxx/psroi_pooling.cu; the differences are as shown:

Rizhiy · 2018-05-08T20:26:14Z

@hewumars I haven't yet got PSRoIPooling to work in PyTorch either.

YanShuo1992 · 2018-07-17T07:26:54Z

@Rizhiy How is the PSRoI pooling going? I have seen you in many different repos. I think we are both focused on light-head rcnn, right? I can't get PSRoIPooling working in PyTorch either. I think it could be easier to use the code from the official tf implementation.

Rizhiy · 2018-07-17T12:11:22Z

@YanShuo1992 I'm currently using roytseng-tw/Detectron.pytorch. So far I have focused on getting the best mAP, so I haven't put much work into light-head. I will try to let you know if I get something working.

YanShuo1992 · 2018-07-19T06:26:52Z

@hewumars @Rizhiy
I checked @hewumars's light head rcnn code and I may have found something wrong. PSRoIPooling should be used after res5 (stage 5) in resnet50, right? But the RPN is still after stage 4. What do you think?

Rizhiy · 2018-07-19T18:42:56Z

That's not entirely correct. You need to pass the output of res5 through a layer which has k*k*n filters, where k is the pooling size and n is an arbitrary number of output channels (10 in the paper). Then you apply psroipool on that.

I suggest you check https://github.com/msracver/Deformable-ConvNets/blob/f4e163719c8e63cfad7af1caaaab93d373750393/rfcn/symbols/resnet_v1_101_rfcn.py#L785-L798 for reference.
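
To make the k*k*n layout concrete, here is a minimal pure-Python sketch of position-sensitive RoI average pooling. It is an illustration only, not the code from any of the repos mentioned in this thread: the function name `psroi_pool`, the nested-list feature format, and the simplified handling of RoI coordinates (no rounding of the box, no batch index) are assumptions made for the example. The channel indexing `(c * k + i) * k + j` follows one common layout, the one used in the R-FCN style CUDA kernels.

```python
import math


def _clamp(value, low, high):
    """Clamp value into the closed interval [low, high]."""
    return min(max(value, low), high)


def psroi_pool(features, roi, k, n, spatial_scale=1.0):
    """Position-sensitive RoI average pooling on nested lists (sketch).

    features: list shaped [k*k*n][H][W], the output of the k*k*n-filter layer.
    roi: (x1, y1, x2, y2) in image coordinates (hypothetical format).
    Returns a list shaped [n][k][k].
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    if channels != k * k * n:
        raise ValueError("expected k*k*n input channels")

    x1, y1, x2, y2 = [v * spatial_scale for v in roi]
    # Avoid zero-sized RoIs, as typical kernels do.
    bin_w = max(x2 - x1, 0.1) / k
    bin_h = max(y2 - y1, 0.1) / k

    out = [[[0.0] * k for _ in range(k)] for _ in range(n)]
    for c in range(n):
        for i in range(k):      # bin row
            for j in range(k):  # bin column
                h0 = _clamp(math.floor(y1 + i * bin_h), 0, height)
                h1 = _clamp(math.ceil(y1 + (i + 1) * bin_h), 0, height)
                w0 = _clamp(math.floor(x1 + j * bin_w), 0, width)
                w1 = _clamp(math.ceil(x1 + (j + 1) * bin_w), 0, width)
                # Each output bin reads from its own dedicated input channel.
                src = (c * k + i) * k + j
                cells = [features[src][y][x]
                         for y in range(h0, h1) for x in range(w0, w1)]
                out[c][i][j] = sum(cells) / len(cells) if cells else 0.0
    return out


# Toy input: k=2, n=1, so 4 channels; channel ch is filled with the value ch.
k, n = 2, 1
features = [[[float(ch)] * 4 for _ in range(4)] for ch in range(k * k * n)]
pooled = psroi_pool(features, roi=(0, 0, 4, 4), k=k, n=n)
print(pooled)
```

With the toy input each bin of the 2x2 output picks up the value of a different input channel, which is the "position-sensitive" part: bin (i, j) of output channel c never looks at any other channel.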

YanShuo1992 · 2018-07-20T00:49:35Z

@Rizhiy
I will check the official rfcn to see how the rpn and large conv are organized.
@roytseng-tw
I am trying to implement light rcnn based on your code. I tried code from @hewumars and I get
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generic/THCStorage.cu:58

So I checked the .cu code of psroipooling. I found your commit that does not use rounding in roialign_kernel.cu. Can you tell me the reason for that, or what problem it would lead to?

GYxiaOH · 2018-08-22T09:52:28Z

@YanShuo1992
Did you hit out of memory after some iterations? I have the same problem. I compared the psroi code with caffe2 and couldn't find anything, but I have barely done any CUDA coding, so...
Did you solve the problem?

YanShuo1992 · 2018-08-23T00:43:57Z

@GYxiaOH
Yes. I hit out of memory when using psroi. I also checked the caffe2 code and the tensorflow code and found nothing. For now, I have given up on psroi and use alignroi.

