CUDA memory usage continuously increases #77

Open
vlfom opened this issue Jan 25, 2022 · 3 comments

vlfom commented Jan 25, 2022

Dear authors,

Thank you for the great work and clean code.

I am using the default CenterNet2 configuration (from Base-CenterNet2.yaml); however, during training I observe that the memory reserved by CUDA keeps increasing until training fails with a CUDA OOM error. When I replace CenterNet2 with the default RPN, the issue disappears.
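
For reference, the comparison I describe is roughly the following sketch; the config keys are the standard detectron2 yacs keys and `add_centernet_config` is the helper from this repo, so the exact names may differ slightly in your checkout:

```python
from detectron2.config import get_cfg
from centernet.config import add_centernet_config  # helper from the CenterNet2 repo

cfg = get_cfg()
add_centernet_config(cfg)
cfg.merge_from_file("configs/Base-CenterNet2.yaml")

# Run that leaks: proposal generator as set in Base-CenterNet2.yaml ("CenterNet").
# Control run that does not leak: swap in the default RPN proposal generator.
cfg.MODEL.PROPOSAL_GENERATOR.NAME = "RPN"
```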

I tried adding gc.collect() and torch.cuda.empty_cache() to the training loop with no success.
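
Concretely, what I tried looks roughly like this (a sketch; `model`, `optimizer`, and `train_loader` are placeholders for the objects built by the actual training script):

```python
import gc
import torch

for iteration, batch in enumerate(train_loader):
    loss_dict = model(batch)            # detectron2 models return a dict of losses in training mode
    losses = sum(loss_dict.values())

    optimizer.zero_grad()
    losses.backward()
    optimizer.step()

    # Attempted workarounds: neither stopped the reserved memory from growing.
    gc.collect()
    torch.cuda.empty_cache()
```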

Have you noticed such behavior in the past, or could you provide some hints on what the issue could be? Below I also provide some reference screenshots.

Note: in my project, there are a few things that differ from the configuration above: I train on 50% of the COCO dataset and I use LazyConfig to initialize the model. However, I reimplemented the configuration twice and both implementations hit the same issue, so it is unlikely that there is a bug in my code.
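
For completeness, the model is built via detectron2's LazyConfig API, roughly like this (the config path is a placeholder):

```python
from detectron2.config import LazyConfig, instantiate

cfg = LazyConfig.load("path/to/my_centernet2_config.py")  # placeholder path
model = instantiate(cfg.model)
```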

[Screenshots: GPU memory usage over training iterations; the memory allocation keeps increasing in both.]
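
The same trend can also be tracked from inside the training loop by logging PyTorch's allocator statistics; a minimal helper (not from the repo) would be:

```python
import torch

def log_cuda_memory(iteration: int) -> None:
    # memory_allocated(): bytes currently held by live tensors;
    # memory_reserved(): bytes reserved by the caching allocator.
    allocated_mib = torch.cuda.memory_allocated() / 2**20
    reserved_mib = torch.cuda.memory_reserved() / 2**20
    print(f"iter {iteration}: allocated={allocated_mib:.0f} MiB, "
          f"reserved={reserved_mib:.0f} MiB")
```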

vlfom changed the title from "CUDA memory continuously increasing" to "CUDA memory usage continuously increases" on Jan 25, 2022

costapt commented Jan 28, 2022

Hi!

I am facing the same issue. I tried replacing the CustomCascadeROIHeads with the StandardROIHeads to narrow down the cause, but the problem persists. I have the feeling that the problem is in CenterNet, but I have not been able to pinpoint where.
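
Concretely, the only change I made was the ROI heads name in the yacs config (a sketch, assuming the standard detectron2 key; `cfg` is built from the repo's base config as usual):

```python
# CenterNet2 setting described above:
#   cfg.MODEL.ROI_HEADS.NAME = "CustomCascadeROIHeads"
# Swap in the plain detectron2 ROI heads to rule them out:
cfg.MODEL.ROI_HEADS.NAME = "StandardROIHeads"
```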


kachiO commented Jan 29, 2022

I've encountered this issue as well. It seems to happen with the two-stage CenterNet2 models. The workaround I've found is running the model with the following versions: detectron2=v0.6, pytorch=1.8.1, python=3.6, and cuda=11.1.
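
A quick way to confirm the environment matches those versions (standard version attributes only):

```python
import sys

import detectron2
import torch

print("python     :", sys.version.split()[0])     # expect 3.6.x
print("torch      :", torch.__version__)          # expect 1.8.1
print("cuda       :", torch.version.cuda)         # expect 11.1
print("detectron2 :", detectron2.__version__)     # expect 0.6
```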


costapt commented Jan 29, 2022

Thank you! 👍 It seems to have solved the problem here as well!
