MODEL_ZOO

Common settings and notes

Multi-scale training is used by default in all models. All results are reported with single-scale testing.
We report runtime on our local workstation with a Titan Xp GPU and a Titan RTX GPU.
All models are trained on 8-GPU servers by default. The 1280 models are trained on 24G GPUs. Reducing the batch size, with the learning rate scaled by the linear learning rate rule, should be fine.
All models can be downloaded directly from Google Drive.
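
The linear learning rate rule can be sketched as below. This is a minimal illustration, assuming the reference setting of learning rate 0.01 per batch size 16 stated in the CenterNet notes; the function name `scale_lr` and its parameters are hypothetical and not part of the released code.

```python
def scale_lr(batch_size, ref_lr=0.01, ref_batch_size=16):
    """Scale the learning rate linearly with the total batch size.

    ref_lr and ref_batch_size follow the "0.01 for each batch size 16"
    rule quoted in this document; the function itself is illustrative.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return ref_lr * batch_size / ref_batch_size


if __name__ == "__main__":
    # Halving the batch size halves the learning rate.
    for bs in (16, 8, 4):
        print(bs, scale_lr(bs))
```

In practice the scaled value would be written into the solver settings of the chosen config before training with fewer or smaller GPUs.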

COCO

CenterNet

Model	val mAP	FPS (Titan Xp / Titan RTX)	links
CenterNet-S4_DLA_8x	42.5	50 / 71	config/model
CenterNet-FPN_R50_1x	40.2	20 / 24	config/model

Note

CenterNet-S4_DLA_8x is a re-implemented version of the original CenterNet (stride 4), with several changes, including:
- Using top-left-right-bottom box encoding and GIoU loss; adding regression loss to the center 3x3 region.
- Adding more positive pixels to the heatmap loss: pixels within the center 3x3 region whose regression loss is small.
- Using heavier crop augmentation (EfficientDet-style crop ratio 0.1-2), and removing color augmentations.
- Using standard NMS instead of max pooling.
- Using RetinaNet-style optimizer (SGD), learning rate rule (0.01 for each batch size 16), and schedule (8x12 epochs).
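
The box encoding and GIoU loss named in the first item above can be illustrated with a small sketch. It assumes that the encoding stores the distances from a pixel location to the four box sides (an interpretation, not confirmed by this document) and uses the standard GIoU definition for axis-aligned boxes. The function names are hypothetical and this is not the repository's implementation.

```python
def ltrb_to_xyxy(px, py, left, top, right, bottom):
    """Decode side distances from a location (px, py) into (x1, y1, x2, y2)."""
    return (px - left, py - top, px + right, py + bottom)


def giou(box_a, box_b, eps=1e-9):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Intersection is clamped to zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    # Smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    iou = inter / (union + eps)
    return iou - (enclose - union) / (enclose + eps)


def giou_loss(pred_box, target_box):
    """Loss is 0 for a perfect match and grows as the boxes move apart."""
    return 1.0 - giou(pred_box, target_box)


if __name__ == "__main__":
    pred = ltrb_to_xyxy(5, 5, 1, 2, 3, 4)
    print(pred, giou_loss(pred, (4, 3, 8, 9)))
```

Unlike plain IoU, GIoU stays informative for non-overlapping boxes because the enclosing-box term still changes as the prediction moves toward the target.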
CenterNet-FPN_R50_1x is a (new) FPN version of CenterNet. It includes the changes above, and assigns objects to FPN levels based on a fixed size range. The model is trained with standard short edge 640-800 multi-scale training for 12 epochs (1x).
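
Assigning objects to FPN levels by a fixed size range might look like the sketch below. The size ranges, the level names, and the use of the square root of the box area as the object size are all assumptions made for illustration; the actual values live in the model config and are not stated here.

```python
import math

# Hypothetical (min_size, max_size) range per FPN level; illustrative values only.
SIZE_RANGES = {
    "p3": (0, 64),
    "p4": (64, 128),
    "p5": (128, 256),
    "p6": (256, 512),
    "p7": (512, math.inf),
}


def assign_levels(box, size_ranges=SIZE_RANGES):
    """Return the FPN levels whose size range contains the object.

    box is (x1, y1, x2, y2). Object size is taken as sqrt(width * height),
    which is an assumption of this sketch.
    """
    x1, y1, x2, y2 = box
    size = math.sqrt(max(x2 - x1, 0) * max(y2 - y1, 0))
    return [name for name, (lo, hi) in size_ranges.items() if lo <= size < hi]


if __name__ == "__main__":
    print(assign_levels((0, 0, 32, 32)))
    print(assign_levels((0, 0, 100, 100)))
```

The function returns a list so that overlapping ranges, where one object is supervised on more than one level, can be expressed by changing only the table.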

CenterNet2

Model	val mAP	FPS (Titan Xp / Titan RTX)	links
CenterNet2-F_R50_1x	41.7	22 / 27	config/model
CenterNet2_R50_1x	42.9	18 / 24	config/model
CenterNet2_X101-DCN_2x	49.9	6 / 8	config/model
CenterNet2_DLA-BiFPN-P3_4x	43.8	40 / 50	config/model
CenterNet2_DLA-BiFPN-P3_24x	45.6	40 / 50	config/model
CenterNet2_R2-101-DCN_896_4x	51.2	9 / 13	config/model
CenterNet2_R2-101-DCN-BiFPN_1280_4x	52.9	6 / 8	config/model
CenterNet2_R2-101-DCN-BiFPN_4x+4x_1560_ST	56.1	3 / 5	config/model
CenterNet2_DLA-BiFPN-P5_640_24x_ST	49.2	33 / 38	config/model

Note

CenterNet2-F_R50_1x uses Faster RCNN as the second stage. All other CenterNet2 models use Cascade RCNN as the second stage.
CenterNet2_DLA-BiFPN-P3_4x follows the same training setting as realtime-FCOS.
CenterNet2_DLA-BiFPN-P3_24x is trained by repeating the 4x schedule (starting from learning rate 0.01) 6 times.
R2 denotes the Res2Net backbone. To train Res2Net models, you need to download the ImageNet pre-trained weights here and place them at output/r2_101.pkl.
The last 4 models in the table are trained with the EfficientDet-style resize-and-crop augmentation, instead of detectron2's default random short-edge resizing. We found this trains faster (per iteration) and gives better performance under a long schedule.
_ST denotes self-training with pseudo-labels produced by Scaled-YOLOv4 on COCO unlabeled images, with a hard score threshold of 0.5. Our processed pseudo-labels can be downloaded here.
CenterNet2_R2-101-DCN-BiFPN_4x+4x_1560_ST finetunes from CenterNet2_R2-101-DCN-BiFPN_1280_4x for an additional 4x schedule with the self-training data. It is trained at 1280x1280 but tested at 1560x1560.
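
The hard score threshold used for the pseudo-labels can be sketched as a simple filter. The sketch assumes detections are dictionaries in the COCO results style with a `score` key, and that a detection exactly at the threshold is kept; neither detail is confirmed by this document, and the function name is hypothetical.

```python
def filter_pseudo_labels(detections, score_threshold=0.5):
    """Keep only detections whose confidence meets the hard threshold.

    detections: list of dicts with at least a "score" key (assumed format).
    """
    return [det for det in detections if det["score"] >= score_threshold]


if __name__ == "__main__":
    # Made-up detections, for illustration only.
    raw = [
        {"image_id": 1, "category_id": 3, "bbox": [10, 20, 30, 40], "score": 0.91},
        {"image_id": 1, "category_id": 7, "bbox": [50, 60, 20, 20], "score": 0.32},
    ]
    kept = filter_pseudo_labels(raw)
    print(len(kept), "of", len(raw), "detections kept as pseudo-labels")
```

A hard threshold means every surviving detection is treated as a regular ground-truth box during training, with no weighting by its score.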

LVIS v1

Model	val mAP box	links
LVIS_CenterNet2_R50_1x	26.5	config/model
LVIS_CenterNet2_R50_Fed_1x	28.3	config/model

The models are trained with repeat-factor sampling.
LVIS_CenterNet2_R50_Fed_1x is CenterNet2 with our federated loss. See Appendix D of our paper or our technical report for the LVIS challenge for details.
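
Repeat-factor sampling, as commonly formulated for LVIS, gives each category a repeat factor max(1, sqrt(t / f_c)), where f_c is the fraction of images containing category c, and gives each image the largest factor among its categories. The sketch below illustrates that computation in plain Python. The threshold value and the function name are assumptions, and this is not the sampler the training code uses.

```python
import math
from collections import Counter


def repeat_factors(image_categories, threshold=0.001):
    """Compute a per-image repeat factor.

    image_categories: list with one collection of category ids per image.
    threshold: frequency below which a category is oversampled (assumed value).
    """
    num_images = len(image_categories)
    # Count in how many images each category appears.
    image_count = Counter(c for cats in image_categories for c in set(cats))
    category_rep = {
        c: max(1.0, math.sqrt(threshold / (n / num_images)))
        for c, n in image_count.items()
    }
    # An image is repeated according to its rarest category.
    return [
        max((category_rep[c] for c in set(cats)), default=1.0)
        for cats in image_categories
    ]


if __name__ == "__main__":
    # Toy data with a deliberately large threshold so the effect is visible.
    print(repeat_factors([{1}, {1}, {1}, {1, 2}], threshold=1.0))
```

A fractional factor is usually applied stochastically, for example repeating an image with factor 1.5 once or twice with equal probability in each epoch.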

Objects365

Model	val mAP	links
O365_CenterNet2_R50_1x	22.6	config/model

Note

The Objects365 dataset can be downloaded here.
The model is trained with class-aware sampling.
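
One common form of class-aware sampling first draws a class uniformly at random and then draws an image that contains that class, so rare classes are seen as often as frequent ones. The sketch below illustrates that form. It is an assumption about the general technique, not a description of the sampler in this codebase, and the function name is hypothetical.

```python
import random
from collections import defaultdict


def class_aware_sample(image_categories, num_samples, seed=0):
    """Draw image indices by sampling a class first, then an image of that class.

    image_categories: list with one collection of category ids per image.
    """
    rng = random.Random(seed)
    images_by_class = defaultdict(list)
    for index, cats in enumerate(image_categories):
        for c in set(cats):
            images_by_class[c].append(index)
    if not images_by_class:
        raise ValueError("no annotated categories to sample from")
    classes = sorted(images_by_class)
    samples = []
    for _ in range(num_samples):
        chosen_class = rng.choice(classes)
        samples.append(rng.choice(images_by_class[chosen_class]))
    return samples


if __name__ == "__main__":
    # Ten images of class 0 and a single image of class 1.
    data = [{0}] * 10 + [{1}]
    picks = class_aware_sample(data, 1000)
    print("share of the rare-class image:", picks.count(10) / len(picks))
```

With uniform image sampling the single rare-class image would be drawn about 9% of the time; with this scheme it is drawn in roughly half of the samples.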
