This is a PyTorch implementation of MP-CNN as a base model with modifications and additions such as attention and sparse features.
Here is the MP-CNN paper:
- Hua He, Kevin Gimpel, and Jimmy Lin. Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), pages 1576-1586.
The datasets and the GloVe word embeddings are available at https://git.uwaterloo.ca/jimmylin/Castor-data.
The directory layout should look like this:
├── MP-CNN-Variants
│ ├── README.md
│ ├── ...
├── Castor-data
│ ├── README.md
│ ├── ...
│ ├── msrvid/
│ ├── sick/
│ └── GloVe/
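A quick way to confirm the layout before training is a small check like the one below. It is only a sketch: the helper name `missing_data_dirs` is hypothetical and not part of this repo, and it assumes that MP-CNN-Variants and Castor-data sit side by side in the same parent directory, as shown above.

```python
from pathlib import Path

# Subdirectories of Castor-data named in the layout above.
EXPECTED_DATA_DIRS = ("msrvid", "sick", "GloVe")


def missing_data_dirs(workspace):
    """Return the expected Castor-data subdirectories that are absent.

    `workspace` is the directory that holds both MP-CNN-Variants and
    Castor-data (hypothetical helper, not part of the repo).
    """
    data_root = Path(workspace) / "Castor-data"
    return [name for name in EXPECTED_DATA_DIRS if not (data_root / name).is_dir()]


if __name__ == "__main__":
    # Assumes the script is run from inside MP-CNN-Variants.
    missing = missing_data_dirs(Path.cwd().parent)
    print("missing:", missing if missing else "none")
```

An empty list means every directory named in the layout was found.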
Note that the original paper does not use dropout, so setting --dropout 0 mimics this behaviour and allows a fair comparison with the results reported below.
To visualize the training process with TensorBoard, add the --tensorboard flag.
To run MP-CNN on the SICK dataset mimicking the original paper as closely as possible, use the following command:
python main.py mpcnn.sick.model --dataset sick --epochs 19 --dropout 0 --lr 0.0005
Implementation and config | Pearson's r | Spearman's ρ | MSE |
---|---|---|---|
Paper | 0.8686 | 0.8047 | 0.2606 |
PyTorch using above config | 0.8692 | 0.8145 | 0.2520 |
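For readers unfamiliar with the three metrics in the table above (Pearson correlation, Spearman rank correlation, and mean squared error), here is a minimal pure-Python sketch of how each is defined. It is illustrative only and is not the evaluation code used by this repo; all function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


def mean_squared_error(predicted, gold):
    """Mean of squared differences between predictions and gold scores."""
    return sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(gold)


# Toy relatedness scores (made up for illustration, not from SICK).
gold = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.9, 3.2, 3.9, 4.6]
print(pearson_r(predicted, gold), spearman_rho(predicted, gold),
      mean_squared_error(predicted, gold))
```

Higher is better for the two correlations, while lower is better for MSE.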
To run MP-CNN on TrecQA, you first need to run the get_trec_eval.sh script in utils.
Then, you can run:
python main.py mpcnn.trecqa.model --arch mpcnn --dataset trecqa --epochs 5 --holistic-filters 200 --lr 0.00018 --regularization 0.0006405 --dropout 0
Implementation and config | MAP | MRR |
---|---|---|
Paper | 0.762 | 0.854 |
PyTorch using above config | 0.774 | 0.836 |
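The scores in the table above come from trec_eval. As a rough illustration of what mean average precision (MAP) and mean reciprocal rank (MRR) measure, the sketch below computes both from per-question relevance labels that have already been sorted by descending model score. The function names are hypothetical, and trec_eval's handling of edge cases (for example, questions without any relevant answer) may differ from this simplified version.

```python
def average_precision(labels):
    """labels: 1/0 relevance of candidates, sorted by descending model score."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            # Precision at the rank of each relevant answer.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(labels):
    """1 / rank of the first relevant answer, or 0.0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def mean_over_questions(metric, ranked_labels_per_question):
    """Average a per-question metric over all questions."""
    scores = [metric(labels) for labels in ranked_labels_per_question]
    return sum(scores) / len(scores)


# Two toy questions (made up): relevance of each ranked candidate answer.
questions = [[0, 1, 0, 1], [1, 0, 0]]
print("MAP:", mean_over_questions(average_precision, questions))
print("MRR:", mean_over_questions(reciprocal_rank, questions))
```

MAP rewards placing all correct answers near the top of the ranking, while MRR only looks at the position of the first correct answer.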
The paper results are reported in Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks.
To run MP-CNN on WikiQA, you also need trec_eval, as with TrecQA.
Then, you can run:
python main.py mpcnn.wikiqa.model --arch mpcnn --dataset wikiqa --epochs 5 --holistic-filters 100 --lr 0.0001 --regularization 0.0002 --dropout 0
Implementation and config | MAP | MRR |
---|---|---|
Paper | 0.693 | 0.709 |
PyTorch using above config | 0.699 | 0.714 |
The paper results are reported in Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks.
To run MP-CNN on the MSRVID dataset, use the following command:
python main.py mpcnn.msrvid.model --dataset msrvid --batch-size 16 --lr 0.0005 --epsilon 1e-7 --epochs 32 --dropout 0 --regularization 0.001
You should be able to obtain a Pearson's r of 0.8980 (untuned). For reference, the performance reported in the paper is 0.9090.
To run MP-CNN on the MSRP dataset, use the following command:
python main.py mpcnn.msrp.model --dataset msrp --epochs 15
To see all options available, use
python main.py --help
There are some scripts in this repo for hyperparameter optimization using watermill, with some hacks since the library is in alpha. Hence, the imports in hyperparameter_tuning_{random,hyperband}.py and utils/hyperband.py will not work for you at the moment.
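Until those scripts are usable, a plain random search can be approximated by sampling values and generating main.py command lines. The sketch below is a stand-in under stated assumptions, not the repo's tuning code and not watermill: the sampling ranges, the helper names, and the trial model file names are all hypothetical, and only flags that appear in the commands above are used.

```python
import random
import shlex


def sample_config(rng):
    """Draw one configuration; the ranges are illustrative assumptions."""
    return {
        # Log-uniform between 1e-5 and 1e-3.
        "lr": 10 ** rng.uniform(-5, -3),
        "regularization": 10 ** rng.uniform(-5, -3),
        "holistic-filters": rng.choice([100, 200, 300]),
    }


def build_command(model_file, dataset, config):
    """Build an argument list that mirrors the documented TrecQA command."""
    args = ["python", "main.py", model_file, "--arch", "mpcnn", "--dataset", dataset]
    for flag, value in config.items():
        args += [f"--{flag}", str(value)]
    return args


rng = random.Random(0)  # fixed seed so the trials are reproducible
for trial in range(3):
    config = sample_config(rng)
    # Hypothetical model file name; each trial writes to its own file.
    command = build_command(f"mpcnn.trecqa.trial{trial}.model", "trecqa", config)
    print(shlex.join(command))
```

The script only prints the commands, so they can be inspected before being run by hand or passed to a job scheduler.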
For results, please see my Master's thesis here:
@mastersthesis{tu2018experimental,
title={An Experimental Analysis of Multi-Perspective Convolutional Neural Networks},
author={Tu, Zhucheng},
year={2018},
school={University of Waterloo}
}