# MODEL_ZOO

## Common settings and notes

- Multiscale training is used by default in all models. All results are reported with single-scale testing.
- We report runtime on our local workstation with a Titan Xp GPU and a Titan RTX GPU.
- All models are trained on 8-GPU servers by default. The 1280 models are trained on GPUs with 24GB memory. Reducing the batch size with the linear learning rate rule should be fine (see the sketch after this list).
- All models can be downloaded directly from Google Drive.
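
The linear learning rate rule simply scales the base learning rate in proportion to the total batch size. A minimal sketch, assuming the 0.01-per-batch-size-16 base mentioned in the CenterNet notes below; the values are illustrative, not read from any config in this repo:

```python
# Illustrative values only; the actual configs define their own base settings.
REFERENCE_BATCH_SIZE = 16   # batch size the reference learning rate assumes
REFERENCE_LR = 0.01         # assumed RetinaNet-style base learning rate

def scaled_lr(batch_size: int) -> float:
    """Scale the base learning rate linearly with the total batch size."""
    return REFERENCE_LR * batch_size / REFERENCE_BATCH_SIZE

print(scaled_lr(8))   # 0.005 when training with half the default batch size
print(scaled_lr(32))  # 0.02 when doubling it
```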

## COCO

### CenterNet

| Model | val mAP | FPS (Titan Xp / Titan RTX) | links |
|-------|---------|----------------------------|-------|
| CenterNet-S4_DLA_8x | 42.5 | 50 / 71 | config/model |
| CenterNet-FPN_R50_1x | 40.2 | 20 / 24 | config/model |

#### Note

- CenterNet-S4_DLA_8x is a re-implemented version of the original CenterNet (stride 4), with several changes, including:
  - Using top-left-right-bottom box encoding and GIoU loss; adding regression loss to the center 3x3 region.
  - Adding more positive pixels to the heatmap loss for locations whose regression loss is small and that lie within the center 3x3 region.
  - Using heavier crop augmentation (EfficientDet-style crop ratio 0.1-2) and removing color augmentations.
  - Using standard NMS instead of max pooling.
  - Using a RetinaNet-style optimizer (SGD), learning rate rule (0.01 per batch size of 16), and schedule (8x12 epochs).
- CenterNet-FPN_R50_1x is a (new) FPN version of CenterNet. It includes the changes above and assigns objects to FPN levels based on a fixed size range (sketched below). The model is trained with standard short-edge 640-800 multi-scale training for 12 epochs (1x).
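
A minimal sketch of the fixed size-range assignment; the level names and thresholds here are assumptions for illustration, not the values used by the CenterNet-FPN config:

```python
import math

# Hypothetical per-level size ranges (pixels of the box's longer side);
# the real thresholds are defined in the model config, not here.
FPN_SIZE_RANGES = {
    "p3": (0, 64),
    "p4": (64, 128),
    "p5": (128, 256),
    "p6": (256, 512),
    "p7": (512, math.inf),
}

def assign_fpn_level(box):
    """Assign a ground-truth box (x1, y1, x2, y2) to an FPN level by its size."""
    x1, y1, x2, y2 = box
    size = max(x2 - x1, y2 - y1)
    for level, (lo, hi) in FPN_SIZE_RANGES.items():
        if lo <= size < hi:
            return level
    return "p7"

print(assign_fpn_level((10, 10, 90, 60)))  # longer side 80 -> "p4"
```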

### CenterNet2

| Model | val mAP | FPS (Titan Xp / Titan RTX) | links |
|-------|---------|----------------------------|-------|
| CenterNet2-F_R50_1x | 41.7 | 22 / 27 | config/model |
| CenterNet2_R50_1x | 42.9 | 18 / 24 | config/model |
| CenterNet2_X101-DCN_2x | 49.9 | 6 / 8 | config/model |
| CenterNet2_DLA-BiFPN-P3_4x | 43.8 | 40 / 50 | config/model |
| CenterNet2_DLA-BiFPN-P3_24x | 45.6 | 40 / 50 | config/model |
| CenterNet2_R2-101-DCN_896_4x | 51.2 | 9 / 13 | config/model |
| CenterNet2_R2-101-DCN-BiFPN_1280_4x | 52.9 | 6 / 8 | config/model |
| CenterNet2_R2-101-DCN-BiFPN_4x+4x_1560_ST | 56.1 | 3 / 5 | config/model |
| CenterNet2_DLA-BiFPN-P5_640_24x_ST | 49.2 | 33 / 38 | config/model |

#### Note

- CenterNet2-F_R50_1x uses Faster R-CNN as the second stage. All other CenterNet2 models use Cascade R-CNN as the second stage.
- CenterNet2_DLA-BiFPN-P3_4x follows the same training setting as realtime-FCOS.
- CenterNet2_DLA-BiFPN-P3_24x is trained by repeating the 4x schedule (starting from learning rate 0.01) 6 times.
- R2 means the Res2Net backbone. To train Res2Net models, you need to download the ImageNet pre-trained weights here and place them in output/r2_101.pkl.
- The last 4 models in the table are trained with EfficientDet-style resize-and-crop augmentation instead of detectron2's default short-edge resizing. We found this trains faster (per iteration) and gives better performance under a long schedule.
- _ST means self-training with pseudo-labels produced by Scaled-YOLOv4 on COCO unlabeled images, using a hard score threshold of 0.5 (see the sketch after this list). Our processed pseudo-labels can be downloaded here.
- CenterNet2_R2-101-DCN-BiFPN_4x+4x_1560_ST finetunes from CenterNet2_R2-101-DCN-BiFPN_1280_4x for an additional 4x schedule with the self-training data. It is trained at 1280x1280 but tested at 1560x1560.
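
A minimal sketch of the pseudo-label filtering step, assuming COCO-style detection results with a "score" field; the file name is hypothetical, and this is not the preprocessing script used to produce the released pseudo-labels:

```python
import json

SCORE_THRESHOLD = 0.5  # hard threshold applied to the Scaled-YOLOv4 scores

def filter_pseudo_labels(detections):
    """Keep only detections whose confidence clears the hard threshold."""
    return [d for d in detections if d["score"] >= SCORE_THRESHOLD]

# Hypothetical path to COCO-format predictions on the unlabeled images.
with open("unlabeled_predictions.json") as f:
    raw = json.load(f)

pseudo_labels = filter_pseudo_labels(raw)
print(f"kept {len(pseudo_labels)} / {len(raw)} boxes as pseudo-labels")
```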

## LVIS v1

| Model | val box mAP | links |
|-------|-------------|-------|
| LVIS_CenterNet2_R50_1x | 26.5 | config/model |
| LVIS_CenterNet2_R50_Fed_1x | 28.3 | config/model |
- The models are trained with repeat-factor sampling (see the sketch after this list).
- LVIS_CenterNet2_R50_Fed_1x is CenterNet2 with our federated loss. See Appendix D of our paper or our technical report for the LVIS challenge for details.
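
For reference, a minimal sketch of LVIS-style repeat-factor sampling: each category c gets a repeat factor max(1, sqrt(t / f(c))), where f(c) is the fraction of training images containing it, and each image is repeated according to the rarest category it contains. The threshold t = 0.001 below is the commonly used default, not necessarily what this repo's configs set:

```python
import math
from collections import defaultdict

REPEAT_THRESHOLD = 0.001  # t; assumed default, check the config for the real value

def image_repeat_factors(images):
    """images: list of sets of category ids, one set per training image."""
    num_images = len(images)
    image_count = defaultdict(int)
    for cats in images:
        for c in cats:
            image_count[c] += 1
    # Per-category repeat factor: max(1, sqrt(t / f(c))).
    category_rep = {
        c: max(1.0, math.sqrt(REPEAT_THRESHOLD / (n / num_images)))
        for c, n in image_count.items()
    }
    # Per-image repeat factor: the max over the categories it contains.
    return [max((category_rep[c] for c in cats), default=1.0) for cats in images]
```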

## Objects365

| Model | val mAP | links |
|-------|---------|-------|
| O365_CenterNet2_R50_1x | 22.6 | config/model |

#### Note

- The Objects365 dataset can be downloaded here.
- The model is trained with class-aware sampling (sketched below).
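
A minimal sketch of class-aware sampling, assuming the dataset is given as (image_id, category_ids) pairs. The two-step scheme, pick a class uniformly at random and then an image containing that class, is the standard formulation; the details here are illustrative rather than this repo's data loader:

```python
import random
from collections import defaultdict

def build_class_to_images(dataset):
    """dataset: iterable of (image_id, category_ids) pairs."""
    class_to_images = defaultdict(list)
    for image_id, category_ids in dataset:
        for c in category_ids:
            class_to_images[c].append(image_id)
    return class_to_images

def sample_class_aware(class_to_images, num_samples):
    """Pick a class uniformly at random, then an image containing that class."""
    classes = list(class_to_images.keys())
    return [
        random.choice(class_to_images[random.choice(classes)])
        for _ in range(num_samples)
    ]
```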