- SSD: Single Shot Multibox Detector
- FSSD: Feature Fusion Single Shot Multibox Detector
- RFB-SSD: Receptive Field Block Net for Accurate and Fast Object Detection
- RefineDet: Single-Shot Refinement Neural Network for Object Detection
System | mAP | FPS (Titan X Maxwell) |
---|---|---|
Faster R-CNN (VGG16) | 73.2 | 7 |
YOLOv2 (Darknet-19) | 78.6 | 40 |
R-FCN (ResNet-101) | 80.5 | 9 |
SSD300* (VGG16) | 77.2 | 46 |
SSD512* (VGG16) | 79.8 | 19 |
RFBNet300 (VGG16) | 80.5 | 83 |
RFBNet512 (VGG16) | 82.2 | 38 |
SSD300 (VGG) | 77.8 | 150 (1080Ti) |
FSSD300 (VGG) | 78.8 | 120 (1080Ti) |
System | test-dev mAP | Time (Titan X Maxwell) |
---|---|---|
Faster R-CNN++ (ResNet-101) | 34.9 | 3.36s |
YOLOv2 (Darknet-19) | 21.6 | 25ms |
SSD300* (VGG16) | 25.1 | 22ms |
SSD512* (VGG16) | 28.8 | 53ms |
RetinaNet500 (ResNet-101-FPN) | 34.4 | 90ms |
RFBNet300 (VGG16) | 29.9 | 15ms* |
RFBNet512 (VGG16) | 33.8 | 30ms* |
RFBNet512-E (VGG16) | 34.4 | 33ms* |
SSD512 (HarDNet68) | 31.7 | TBD (12.9ms**) |
SSD512 (HarDNet85) | 35.1 | TBD (15.9ms**) |
RFBNet512 (HarDNet68) | 33.9 | TBD (16.7ms**) |
RFBNet512 (HarDNet85) | 36.8 | TBD (19.3ms**) |
Note: * These speeds were measured with PyTorch 0.2.0 and cuDNN v6, which is faster than the setup reported in the paper (PyTorch 0.1.12 and cuDNN v5).
Note: ** HarDNet results are measured on a Titan V with PyTorch 1.0.1 and cover detection only; NMS is NOT included and typically adds 13~18ms. For reference, SSD-VGG takes 15.7ms in the same environment (also detection only).
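The tables above mix two units: frames per second and per-image time in milliseconds. The short sketch below converts between them and adds the NMS cost quoted in the note to a detection-only figure. The function names are illustrative and not part of this repository.

```python
def ms_to_fps(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def with_nms(detection_ms: float, nms_ms: float) -> float:
    """Add an NMS cost to a detection-only latency (both in milliseconds)."""
    return detection_ms + nms_ms


if __name__ == "__main__":
    # RFBNet300 (VGG16) is listed at 15 ms per image in the COCO table.
    print(f"15 ms -> {ms_to_fps(15):.1f} FPS")
    # SSD512 (HarDNet68): 12.9 ms detection only, NMS quoted as 13~18 ms.
    low, high = with_nms(12.9, 13), with_nms(12.9, 18)
    print(f"with NMS: {low:.1f} to {high:.1f} ms")
```

Keep in mind that the rows were measured on different GPUs and software versions, so converted numbers are only comparable within one setup.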
System | COCO minival mAP | #parameters |
---|---|---|
SSD MobileNet | 19.3 | 6.8M |
RFB MobileNet | 20.7* | 7.4M |
*: Slightly better than the result reported in the paper (20.5).
- Install PyTorch (0.2.0 to 0.3.1) by selecting your environment on the website and running the appropriate command.
- Clone this repository. It is mainly based on RFBNet, ssd.pytorch and Chainer-ssd; many thanks to their authors.
- Note: We currently only support Python 3+.
- Compile the nms and coco tools:
./make.sh
Note*: Check your GPU architecture support in utils/build.py, line 131. The default is:
'nvcc': ['-arch=sm_52',
- Install pyinn for MobileNet backbone:
pip install git+https://github.com/szagoruyko/pyinn.git@master
- Then download the dataset by following the instructions below and install opencv.
conda install opencv
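The `-arch` value mentioned in the build note above has to match the compute capability of your GPU. A minimal sketch of that mapping follows; `arch_flag` and `KNOWN_GPUS` are hypothetical names, not something utils/build.py provides, and the table only covers the GPUs named in the benchmark tables.

```python
# Compute capabilities (major, minor) of the GPUs mentioned in the tables.
KNOWN_GPUS = {
    "Titan X (Maxwell)": (5, 2),
    "GTX 1080 Ti": (6, 1),
    "Titan V": (7, 0),
}


def arch_flag(major: int, minor: int) -> str:
    """Build the nvcc architecture flag for a compute capability."""
    return f"-arch=sm_{major}{minor}"


if __name__ == "__main__":
    for name, (major, minor) in KNOWN_GPUS.items():
        print(f"{name}: {arch_flag(major, minor)}")
```

Recent PyTorch versions expose `torch.cuda.get_device_capability()`, which returns the (major, minor) pair for the current device; this sketch does not assume it is available in the 0.2.0 to 0.3.1 range required here.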
Note: For training, we currently support VOC and COCO.
To make things easy, we provide simple VOC and COCO dataset loaders that inherit from `torch.utils.data.Dataset`, making them fully compatible with the `torchvision.datasets` API.
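A map-style PyTorch dataset only needs `__len__` and `__getitem__`. The sketch below illustrates that protocol in plain Python so it runs without PyTorch installed; the real loaders subclass `torch.utils.data.Dataset`. `ToyDetectionDataset` and `collate_detections` are hypothetical names used for illustration, and the box format is an assumption.

```python
class ToyDetectionDataset:
    """Map-style dataset: sized and indexable, like torch.utils.data.Dataset."""

    def __init__(self, samples):
        # samples: list of (image, boxes) pairs. Each box is assumed to be
        # [xmin, ymin, xmax, ymax, label]; the box count varies per image.
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        image, boxes = self.samples[index]
        return image, boxes


def collate_detections(batch):
    """Group images together and keep the per-image box lists separate."""
    images = [image for image, _ in batch]
    targets = [boxes for _, boxes in batch]
    return images, targets


if __name__ == "__main__":
    dataset = ToyDetectionDataset([
        ("image_0", [[10, 20, 50, 80, 3]]),
        ("image_1", [[5, 5, 40, 40, 1], [60, 30, 90, 70, 7]]),
    ])
    images, targets = collate_detections([dataset[i] for i in range(len(dataset))])
    print(images, [len(t) for t in targets])
```

Detection batches usually need a custom collate step like this one because each image carries a different number of boxes, so the targets cannot be stacked into one fixed-size array.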
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>
Install the MS COCO dataset at /path/to/coco from the official website; the default location is ~/data/COCO. Follow the instructions to prepare the minival2014 and valminusminival2014 annotations. All label files (.json) should be under the COCO/annotations/ folder. The dataset should have this basic structure:
$COCO/
$COCO/cache/
$COCO/annotations/
$COCO/images/
$COCO/images/test2015/
$COCO/images/train2014/
$COCO/images/val2014/
UPDATE: COCO has since released train2017 and val2017 sets, which are new splits of the same images.
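Before training on COCO it can help to confirm that the directory layout listed above is in place. The following is a small sketch of such a check; `missing_coco_dirs` is a hypothetical helper, not a function from this repository, and it only tests that the directories exist.

```python
from pathlib import Path

# Sub-directories from the layout above, relative to the COCO root.
EXPECTED_DIRS = [
    "cache",
    "annotations",
    "images",
    "images/test2015",
    "images/train2014",
    "images/val2014",
]


def missing_coco_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root).expanduser()
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs("~/data/COCO")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("COCO layout looks complete.")
```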
- First download the fc-reduced VGG-16 PyTorch base network weights at: https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth or from our BaiduYun Driver.
- The MobileNet pre-trained basenet is ported from MobileNet-Caffe and achieves slightly better accuracy than the original reported in the paper. The weight file is available at: https://drive.google.com/open?id=13aZSApybBDjzfGIdqN1INBlPsddxCK14 or BaiduYun Driver.
- By default, we assume you have downloaded the file into the `RFBNet/weights` dir:
mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
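If `wget` is not available, the same download can be sketched with the Python standard library. `fetch_weights` is a hypothetical helper, not part of this repository; it skips the download when the file is already present in the target directory.

```python
import urllib.request
from pathlib import Path

VGG_URL = "https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth"


def fetch_weights(url, weights_dir="weights"):
    """Download url into weights_dir unless the file is already there."""
    target_dir = Path(weights_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Use the last URL path component as the local file name.
    target = target_dir / url.rsplit("/", 1)[-1]
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    return target


# Example (performs a network download on first use):
#     fetch_weights(VGG_URL)
```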
- To train RFBNet with the train script, specify the parameters listed in `train_RFB.py` as flags or change them manually in the file.
python train_test.py -d VOC -v RFB_vgg -s 300
- Note:
- -d: choose the dataset, VOC or COCO.
- -v: choose the backbone version, RFB_VGG, RFB_E_VGG or RFB_mobile.
- -s: image size, 300 or 512.
- You can pick up training from a checkpoint by specifying its path as one of the training parameters (again, see `train_RFB.py` for options).
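The three flags described above can be modeled with `argparse`. This is only a sketch of how such flags might be declared, not the repository's actual parser: the long option names, defaults and help strings are assumptions.

```python
import argparse


def build_parser():
    """Command-line flags mirroring -d, -v and -s from the notes above."""
    parser = argparse.ArgumentParser(description="Training options (sketch)")
    parser.add_argument("-d", "--dataset", default="VOC",
                        choices=["VOC", "COCO"], help="dataset to train on")
    parser.add_argument("-v", "--version", default="RFB_vgg",
                        help="backbone version, e.g. RFB_vgg")
    parser.add_argument("-s", "--size", type=int, default=300,
                        choices=[300, 512], help="input image size")
    return parser


if __name__ == "__main__":
    # Parse the flags from the example command rather than sys.argv.
    args = build_parser().parse_args(["-d", "VOC", "-v", "RFB_vgg", "-s", "300"])
    print(args.dataset, args.version, args.size)
```

Restricting `-d` and `-s` with `choices` makes a mistyped dataset name or image size fail at startup instead of partway through training.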
The test frequency can be found in `train_test.py`.
By default, it directly outputs the mAP results on VOC2007 test or COCO minival2014. For VOC2012 test and COCO test-dev results, manually change the datasets in the `test_RFB.py` file, then save the detection results and submit them to the evaluation server.
- ImageNet mobilenet
- 07+12 RFB_Net300, BaiduYun Driver, FSSD300, SSD300
- COCO RFB_Net512_E, BaiduYun Driver
- COCO RFB_Mobile Net300, BaiduYun Driver
- Add SSD and RFBNet with Harmonic DenseNet (HarDNet) as backbone models.
- Pretrained backbone models: hardnet68_base_bridge.pth | hardnet85_base.pth
- Pretrained models for COCO dataset: SSD512-HarDNet68 | SSD512-HarDNet85 | RFBNet512-HarDNet68 | RFBNet512-HarDNet85