Code corresponding to our paper "Leveraging Context Information for Natural Question Generation"

freesunshine0316/MPQG

Leveraging Context Information for Natural Question Generation

This repository contains the code for our paper "Leveraging Context Information for Natural Question Generation".

The code was developed with TensorFlow 1.4.1.

Update about data split-1

Split-1 was originally released by Du et al. We cannot use their release directly because it contains no information about answer positions. Instead, we used the doclist-xxx.txt files they provide to generate our own data (provided along with this repository). Note that our paper mistakenly reports their train/dev/test split.

Data

We release our data here

Data format

The input data for our system is currently a JSON file, as illustrated by the following sample:

[{"text1":"IBM is headquartered in Armonk , NY .", "annotation1": {"toks":"IBM is headquartered in Armonk , NY .", "POSs":"NNP VBZ VBN IN NNP , NNP .","NERs":"ORG O O O LOC O LOC ."},
  "text2":"Where is IBM located ?", "annotation2": {"toks":"Where is IBM located ?", "POSs":"WRB VBZ NNP VBN .","NERs":"O O ORG O O"},
  "text3":"Armonk , NY", "annotation3": {"toks":"Armonk , NY", "POSs":"NNP , NNP","NERs":"LOC O LOC"}
}]

where "text1" and "annotation1" correspond to the text and rich annotations for the passage. Similarly, "text2" and "text3" correspond to the question and answer parts, respectively.
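As a rough illustration of the format described above, the following Python sketch checks that one instance has the three text fields and that, when annotations are present, the "POSs" and "NERs" strings contain one tag per whitespace-separated token in "toks". The function name `check_instance` is hypothetical and not part of this repository, and the one-tag-per-token alignment is an assumption read off the sample rather than a documented requirement.

```python
import json


def check_instance(instance):
    """Return a list of problems found in one data instance (empty if none)."""
    problems = []
    for i in ("1", "2", "3"):  # 1 = passage, 2 = question, 3 = answer
        text_key, ann_key = "text" + i, "annotation" + i
        if text_key not in instance:
            problems.append("missing " + text_key)
        ann = instance.get(ann_key)
        if ann is None:
            continue  # annotations are optional
        n_toks = len(ann["toks"].split())
        for field in ("POSs", "NERs"):
            # Assumption: one tag per whitespace-separated token.
            if len(ann[field].split()) != n_toks:
                problems.append(ann_key + "." + field + " does not align with toks")
    return problems


sample = json.loads("""
[{"text1": "IBM is headquartered in Armonk , NY .",
  "annotation1": {"toks": "IBM is headquartered in Armonk , NY .",
                  "POSs": "NNP VBZ VBN IN NNP , NNP .",
                  "NERs": "ORG O O O LOC O LOC ."},
  "text2": "Where is IBM located ?",
  "annotation2": {"toks": "Where is IBM located ?",
                  "POSs": "WRB VBZ NNP VBN .",
                  "NERs": "O O ORG O O"},
  "text3": "Armonk , NY",
  "annotation3": {"toks": "Armonk , NY",
                  "POSs": "NNP , NNP",
                  "NERs": "LOC O LOC"}
}]
""")

for index, instance in enumerate(sample):
    print(index, check_instance(instance) or "ok")
```

Running a check like this over a whole data file before training can surface malformed instances early, since the JSON parser alone only verifies syntax.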

Please note that the rich annotations aren't necessary for our system, so you can simply modify the data loading code so that it does not require the "annotation" fields.

Important update on data format

Annotation fields are no longer required by the latest version of our system, so you can feed it data samples like:

[{"text1":"IBM is headquartered in Armonk , NY .",
  "text2":"Where is IBM located ?",
  "text3":"Armonk , NY"
}]
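A minimal sketch of producing a file in this annotation-free format from (passage, question, answer) triples is shown below. The helper `make_instance` and the variable names are hypothetical, and the sketch assumes that the texts are already tokenized with tokens separated by single spaces, as in the sample.

```python
import json


def make_instance(passage, question, answer):
    """Build one instance using the text1/text2/text3 field names of the data format."""
    return {"text1": passage, "text2": question, "text3": answer}


# Assumption: inputs are pre-tokenized, with tokens separated by spaces.
triples = [
    ("IBM is headquartered in Armonk , NY .",
     "Where is IBM located ?",
     "Armonk , NY"),
]

dataset = [make_instance(*triple) for triple in triples]
serialized = json.dumps(dataset, indent=1)
print(serialized)
```

The resulting string can be written to a file and used wherever the system expects its input data.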

Training

For model training, simply execute

python NP2P_trainer.py --config_path config.json

where config.json is a JSON file containing all hyperparameters. We attach a sample config file along with our repository.
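Because a malformed config file only fails once training starts, it can help to parse it beforehand. The sketch below is one way to do that, under the assumption that the file holds a single JSON object mapping hyperparameter names to values. The function `load_config` is hypothetical, and the keys in the demo are placeholders, not the actual hyperparameter names used by this repository; see the sample config file for those.

```python
import json
import os
import tempfile


def load_config(path):
    """Parse a JSON config file and check that it is a single JSON object."""
    with open(path) as f:
        config = json.load(f)
    if not isinstance(config, dict):
        raise ValueError("expected a JSON object of hyperparameters in " + path)
    return config


# Demo on a throwaway file. The keys below are hypothetical placeholders.
demo = {"hypothetical_batch_size": 32, "hypothetical_learning_rate": 0.001}
fd, demo_path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump(demo, f)

loaded = load_config(demo_path)
os.remove(demo_path)

for name in sorted(loaded):
    print(name, "=", loaded[name])
```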

Decoding

For decoding, simply execute

python NP2P_beam_decoder.py --model_prefix xxx --in_path yyy --out_path zzz --mode beam

Cite

If you like our work, please cite:

@inproceedings{song2018leveraging,
  title={Leveraging Context Information for Natural Question Generation},
  author={Song, Linfeng and Wang, Zhiguo and Hamza, Wael and Zhang, Yue and Gildea, Daniel},
  booktitle={Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  pages={569--574},
  year={2018}
}
