This repository contains the code for the paper "Point and Ask: Incorporating Pointing Into Visual Question Answering". If you find this code useful, please cite:
```bibtex
@article{mani2020point,
  title={Point and Ask: Incorporating Pointing into Visual Question Answering},
  author={Arjun Mani and Nobline Yoo and Will Hinthorn and Olga Russakovsky},
  year={2020},
  eprint={2011.13681},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
The following packages are required to run the models (datasets already provided):
- Python 3.6
- PyTorch 1.5+
- Pythia v0.3 (MMF v0.4 can also be used with some refactoring)
- vqa-maskrcnn-benchmark
- tensorboardX
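As a quick sanity check of the environment, a minimal sketch like the following can be used (it assumes the packages install under their usual import names; vqa-maskrcnn-benchmark is omitted since its import path depends on how it was built):

```python
import sys
print("Python:", sys.version.split()[0])        # expect 3.6.x

import torch
print("PyTorch:", torch.__version__)            # expect 1.5 or later
print("CUDA available:", torch.cuda.is_available())

import pythia                                   # Pythia v0.3
import tensorboardX
print("Pythia and tensorboardX imported successfully")
```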
See the datasets folder for all datasets: Local-QA, LookTwice-QA, and Verbal/Spatial Disambiguation (General-QA coming soon!). Each dataset is a JSON file, and the README associated with each dataset describes its format. Train/val/test image splits are also included for each dataset.
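As a rough sketch of how a dataset file might be inspected, something like the following can be used. The file path and field access below are illustrative assumptions; consult each dataset's README for the actual file names and schema.

```python
import json

# Hypothetical path for illustration; actual file names are listed in the datasets folder.
with open("datasets/local_qa/train.json") as f:
    data = json.load(f)

print("Number of entries:", len(data))

# Grab one entry to inspect its fields (works whether the top level is a list or a dict).
example = data[0] if isinstance(data, list) else next(iter(data.values()))
print(example)
```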
You can train either the Local-QA-Model or the Global-QA-Model on any dataset. For example, to run the Local-QA-Model on the Local-QA dataset, run:
```bash
python3 -W ignore ./tools/run.py --tasks vqa --datasets vqamb --model pythia --config configs/vqa/objpart/pythia.yml
```
Config options in pythia.yml should be set according to your needs (they can also be overridden via command-line arguments). Please see the models folder for more details; it also includes details on region/grid feature extraction and the semantic segmentation model for Intent-QA.
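As a minimal sketch of inspecting the config programmatically before a run, the following loads pythia.yml with PyYAML. The key names "training_parameters" and "batch_size" are assumptions based on common Pythia v0.3 config layouts and may differ in this config; the authoritative set of options is the file itself.

```python
import yaml  # PyYAML

CONFIG_PATH = "configs/vqa/objpart/pythia.yml"

with open(CONFIG_PATH) as f:
    config = yaml.safe_load(f)

# List the top-level sections to see which option groups exist.
print(list(config.keys()))

# Illustrative in-memory tweak; key names are assumptions for this sketch only.
config.setdefault("training_parameters", {})["batch_size"] = 32
print(config["training_parameters"]["batch_size"])
```

In practice, edit pythia.yml directly or pass overrides on the command line as noted above.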