20140514
- Error analysis of YOLO compared to Fast R-CNN shows that YOLO makes a significant number of localization errors.
- YOLO has relatively low recall compared to region proposal-based methods.
- Adding batch normalization (BN) to all of the convolutional layers in YOLO gives more than 2% improvement in mAP. BN also lets us remove dropout from the model without overfitting.
- Increase the input resolution to 448 × 448 for detection.
- We remove the fully connected layers from YOLO and use anchor boxes to predict bounding boxes. We also decouple the class prediction mechanism from the spatial location and instead predict class and objectness for every anchor box.
- Run k-means clustering on the training set bounding boxes to automatically find good priors.
- To get a range of resolutions: add a passthrough layer that brings in features from an earlier layer at 26 × 26 resolution.
- To learn to predict well across a variety of input dimensions: during training, every 10 batches our network randomly chooses a new image dimension size.
- Darknet-19
- training
- Classification: we use standard data augmentation tricks including random crops, rotations, and hue, saturation, and exposure shifts.
- Detection: remove the last convolutional layer and instead add three 3x3 convolutional layers with 1024 filters each, followed by a final 1x1 convolutional layer with the number of outputs we need for detection.
- jointly training
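
The multi-scale training bullet above can be sketched as a simple schedule. This is a minimal illustration, not the authors' training code: the function name `multiscale_schedule` and its parameters are hypothetical, and the size range (multiples of 32 from 320 to 608) is taken from the YOLOv2 paper rather than from these notes.

```python
import random


def multiscale_schedule(num_batches, period=10, min_size=320,
                        max_size=608, stride=32, seed=0):
    """Return the square input size to use for each batch.

    Assumption: sizes are multiples of `stride` in [min_size, max_size],
    because the network downsamples by a factor of 32.
    """
    rng = random.Random(seed)
    sizes = list(range(min_size, max_size + 1, stride))
    schedule = []
    current = None
    for batch in range(num_batches):
        # Every `period` batches, draw a new image dimension at random.
        if batch % period == 0:
            current = rng.choice(sizes)
        schedule.append(current)
    return schedule


if __name__ == "__main__":
    # Show which input size each of the first 30 batches would use.
    print(multiscale_schedule(30))
```

Because the network is fully convolutional after the fully connected layers are removed, the same weights can be applied at each of these sizes; only the input resizing changes between batches.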
[15] -- hand-picked priors in Faster R-CNN
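
As an alternative to hand-picked priors, the k-means step from the notes can be sketched as follows. This is a rough sketch under stated assumptions, not the authors' implementation: the YOLOv2 paper uses the distance d(box, centroid) = 1 - IoU(box, centroid), which is what the code below follows; the function names, the mean-based centroid update, and the random initialization are illustrative choices.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (width, height) boxes, assuming they share the same center."""
    w1, h1 = box
    w2, h2 = centroid
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union


def kmeans_priors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs using distance = 1 - IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # assumption: random initial centroids
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Smallest (1 - IoU) distance is the same as largest IoU.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if not members:
                # Keep the old centroid if no box was assigned to it.
                new_centroids.append(centroids[i])
                continue
            mean_w = sum(w for w, _ in members) / len(members)
            mean_h = sum(h for _, h in members) / len(members)
            new_centroids.append((mean_w, mean_h))
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids


if __name__ == "__main__":
    # Toy data: three small boxes and three tall, large boxes.
    boxes = [(10, 12), (12, 10), (11, 11), (100, 200), (110, 190), (105, 210)]
    print(sorted(kmeans_priors(boxes, k=2)))
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, since the error no longer scales with box size.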