Paper accepted at the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops |
---|
| Amaia Salvador | Xavier Giro-i-Nieto | Ferran Marques | Shin'ichi Satoh |
|:-:|:-:|:-:|:-:|
A joint collaboration between:
Universitat Politecnica de Catalunya (UPC) | UPC ETSETB TelecomBCN | UPC Image Processing Group | National Institute of Informatics |
Image representations derived from pre-trained Convolutional Neural Networks (CNNs) have become the new state of the art in computer vision tasks such as instance retrieval. This work explores the suitability for instance retrieval of image- and region-wise representations pooled from an object detection CNN such as Faster R-CNN. We take advantage of the object proposals learned by a Region Proposal Network (RPN) and their associated CNN features to build an instance search pipeline composed of a first filtering stage followed by a spatial reranking. We further investigate the suitability of Faster R-CNN features when the network is fine-tuned for the same objects one wants to retrieve. We assess the performance of our proposed system with the Oxford Buildings 5k, Paris Buildings 6k and a subset of TRECVid Instance Search 2013, achieving competitive results.
You can find our paper in the Proceedings of the DeepVision: Deep Learning in Computer Vision Workshop at CVPR 2016. Our preprint is also available on arXiv.
Please cite with the following BibTeX code:
@InProceedings{Salvador_2016_CVPR_Workshops,
author = {Salvador, Amaia and Giro-i-Nieto, Xavier and Marques, Ferran and Satoh, Shin'ichi},
title = {Faster R-CNN Features for Instance Search},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2016}
}
You may also want to refer to our publication with the more human-friendly Chicago style:
Amaia Salvador, Xavier Giro-i-Nieto, Ferran Marques and Shin'ichi Satoh. "Faster R-CNN Features for Instance Search." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. 2016.
2016-05-Seminar-AmaiaSalvador-DeepVision from Image Processing Group on Vimeo.
<iframe src="//www.slideshare.net/slideshow/embed_code/key/lZzb4HdY6OEZ01" width="595" height="485" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" allowfullscreen> </iframe>

This Python repository contains the necessary tools to reproduce the retrieval pipeline based on off-the-shelf Faster R-CNN features.
- Download and install the Faster R-CNN Python implementation by Ross Girshick. Point `params['fast_rcnn_path']` to the Faster R-CNN root path in `params.py`.
- Download the Oxford and Paris Buildings datasets. The scripts under `data/images/paris` and `data/images/oxford/` will do that for you.
- Download the Faster R-CNN models by running `data/models/fetch_models.sh`.
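
As a rough illustration of the configuration step above, the sketch below shows what the relevant entries might look like and adds a small sanity check. Only the keys `params['fast_rcnn_path']` and `params['dataset']` come from this README; the `get_params` and `check_params` functions, the `VALID_DATASETS` tuple and the path value are hypothetical and are not claimed to match the actual contents of `params.py`.

```python
import os

# Dataset names mentioned in this README.
VALID_DATASETS = ('oxford', 'paris')


def get_params():
    """Hypothetical excerpt: only the two keys are taken from the README."""
    params = {}
    # Placeholder path; replace with your local Faster R-CNN root.
    params['fast_rcnn_path'] = '/path/to/py-faster-rcnn'
    # Switch to 'paris' to process the Paris Buildings dataset.
    params['dataset'] = 'oxford'
    return params


def check_params(params):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not os.path.isdir(params['fast_rcnn_path']):
        problems.append("fast_rcnn_path is not a directory: %s"
                        % params['fast_rcnn_path'])
    if params['dataset'] not in VALID_DATASETS:
        problems.append("dataset must be one of %s" % (VALID_DATASETS,))
    return problems
```

Calling `check_params(get_params())` before running the pipeline would flag a path that still holds the placeholder value.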
- Data preparation. Run `read_data.py` to create the lists of query and database images. Run this twice, changing `params['dataset']` to `'oxford'` and `'paris'`.
- Feature extraction. Run `features.py` to extract Faster R-CNN features for all images in a dataset and store them to disk.
- Ranking. Run `ranker.py` to generate and store the rankings for the queries of the chosen dataset.
- Reranking. Run `rerank.py` to rerank based on region features.
- Evaluation. Run `eval.py` to obtain the Average Precision.
- Visualization. Run `vis.py` to populate `data/figures` with the visualization of the top generated rankings for each query.
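
The ranking, reranking and evaluation steps above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the code in `ranker.py`, `rerank.py` or `eval.py`: it assumes cosine similarity between descriptors, rescoring of the top results by their best-matching region, and a basic Average Precision that ignores the "junk" image handling of the official Oxford and Paris protocols. All function names and the toy data are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def rank(query_vec, image_feats):
    """Filtering stage: sort image ids by similarity of image-wise features."""
    return sorted(image_feats,
                  key=lambda i: cosine(query_vec, image_feats[i]),
                  reverse=True)


def rerank(query_vec, ranking, region_feats, top_n):
    """Reranking stage: rescore the top_n images by their best region."""
    head, tail = ranking[:top_n], ranking[top_n:]

    def best_region_score(image_id):
        return max(cosine(query_vec, r) for r in region_feats[image_id])

    return sorted(head, key=best_region_score, reverse=True) + tail


def average_precision(ranking, relevant):
    """Mean of the precision values at each relevant position."""
    hits, total = 0, 0.0
    for pos, image_id in enumerate(ranking, start=1):
        if image_id in relevant:
            hits += 1
            total += hits / float(pos)
    return total / len(relevant) if relevant else 0.0


# Toy data: three database images with 2-D descriptors.
image_feats = {'a': [1, 0], 'b': [0, 1], 'c': [1, 1]}
region_feats = {'a': [[0, 1], [1, 1]], 'b': [[0, 1]], 'c': [[1, 0]]}
query = [1, 0]

initial = rank(query, image_feats)
final = rerank(query, initial, region_feats, top_n=2)
ap = average_precision(initial, relevant={'a', 'b'})
```

In this toy example the region-wise rescoring swaps the first two results, because image `c` contains a region that matches the query exactly while the best region of image `a` does not.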
We would like to especially thank Albert Gil Moreno and Josep Pujal from our technical support team at the Image Processing Group at UPC.
Albert Gil | Josep Pujal |
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GeForce GTX Titan Z and Titan X used in this work. | |
The Image Processing Group at the UPC is an SGR14 Consolidated Research Group recognized and sponsored by the Catalan Government (Generalitat de Catalunya) through its AGAUR office. | |
This work has been developed in the framework of the project BigGraph TEC2013-43935-R, funded by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF). |
If you have any general questions about our work or code that may be of interest to other researchers, please use the public issues section of this GitHub repo. Alternatively, drop us an e-mail at [email protected] or [email protected].