Educational Question Generation of Children Storybooks via Question Type Distribution Learning and Event-centric Summarization

This repository is the official implementation of our paper. We consider generating high-cognitive-demand (HCD) educational questions by learning question type distribution and event-centric summarization.

Requirements

Python >= 3.6 is required. Run the following commands to install the requirements:

cd transformers && pip install .
pip install spacy==2.3.7
pip install torch==1.7.1
pip install pytorch-lightning==0.9.0
pip install torchtext==0.8.0
pip install rouge-score==0.0.4
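
A quick way to confirm that the pinned versions were actually picked up is the small check script below. It is not part of this repository; it only assumes the packages above are importable under their usual module names.

# check_env.py -- sanity check (not part of the repository) that the pinned
# versions from the Requirements section are the ones installed.
import importlib

expected = {
    "spacy": "2.3.7",
    "torch": "1.7.1",
    "pytorch_lightning": "0.9.0",
    "torchtext": "0.8.0",
}

for name, version in expected.items():
    module = importlib.import_module(name)
    installed = getattr(module, "__version__", "unknown")
    status = "OK" if installed == version else f"expected {version}"
    print(f"{name}: {installed} ({status})")

# The bundled transformers fork is installed from source, so only check that it imports.
import transformers
print("transformers:", transformers.__version__)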

Dataset

FairytaleQA can be found here. We used the implementation in https://github.com/kelvinguu/qanli to obtain the QA statements. The processed QA statement data can be found here.

NOTE: We uploaded our modified qanli here. You first need to obtain your CoNLL-U format file, then run step3_totxt.py to get the transformation (please update the paths accordingly).

Assuming the dataset is at ./data/split and the transformed QA statements are at ./data/infrence, prepare the required format as follows:

python step1_toxlsx.py
python step2_topkl.py
python step3_topkllist.py
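
As a quick sanity check after these three steps, you can open one of the produced pickle files and look at its structure. This is only a sketch: the path below is illustrative, so point it at whatever step3_topkllist.py actually writes in your setup.

# inspect_pkl.py -- minimal sanity check of the prepared data (illustrative path).
import pickle

with open("./data/train.pkl", "rb") as f:  # hypothetical output path; adjust to yours
    data = pickle.load(f)

print(type(data), len(data))
if isinstance(data, list) and data:
    print(data[0])  # peek at the first prepared example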

Training and Prediction

Paths need to be configured manually. Run the three stages in order; a convenience sketch that chains them is shown after the list.

  1. Question type distribution. In the tdl folder:
python train.py
python predict.py
  2. Event-centric summary generation. In the section2sum folder:
python train_section2sum.py
python generate_section2sum.py
  3. Educational question generation. In the sum2question folder:
python train_sum2qustion.py
python generate_sum2question.py
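
For convenience, the stages can be chained in a single driver script. The sketch below is not part of the repository; it simply shells out to the commands listed above and assumes it is run from the repository root with all paths already configured inside each script.

# run_pipeline.py -- optional convenience sketch that runs the three stages in order.
import subprocess

stages = [
    ("tdl", ["python", "train.py"]),
    ("tdl", ["python", "predict.py"]),
    ("section2sum", ["python", "train_section2sum.py"]),
    ("section2sum", ["python", "generate_section2sum.py"]),
    ("sum2question", ["python", "train_sum2qustion.py"]),
    ("sum2question", ["python", "generate_sum2question.py"]),
]

for folder, cmd in stages:
    print(f"Running {' '.join(cmd)} in {folder}/")
    subprocess.run(cmd, cwd=folder, check=True)  # stop if any stage fails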

Trained Models

  • Question type distribution here

  • Event-centric summary generation here

  • Educational question generation: file1 and file2; join them into one file with cat summary2question_epoch=2.ckpt.* > summary2question_epoch=2.ckpt (a quick way to verify the joined file is sketched below)
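
To verify that the re-joined file is a valid checkpoint, load it on the CPU and inspect its keys. This is a minimal sketch that assumes a standard PyTorch Lightning checkpoint layout.

# inspect_ckpt.py -- quick check that the re-joined checkpoint loads correctly.
import torch

ckpt = torch.load("summary2question_epoch=2.ckpt", map_location="cpu")
print(list(ckpt.keys()))                 # typically includes 'state_dict', 'epoch', ...
print(len(ckpt.get("state_dict", {})))   # number of parameter tensors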

Highlighted Results

  • Automatic evaluation on Rouge-L and BERTScore (see the paper for the full results tables).
  • Human evaluation on question types: the K-L distance of the question type distribution between our method and the ground truth is 0.28, while that of QAG (top2) is 0.60 (an illustrative computation is given after this list).
  • Human evaluation on appropriateness for children: the mean rating of our method (2.56±1.31) is significantly higher than that of QAG (top2, 2.22±1.20).
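
For reference, the K-L distance between two question type distributions can be computed as below. The distributions here are made up for illustration only; they are not the numbers used in the paper.

# kl_example.py -- illustrative K-L distance between two question type distributions.
import numpy as np

# Hypothetical distributions over question types (NOT the paper's numbers); each sums to 1.
ours        = np.array([0.10, 0.05, 0.30, 0.10, 0.25, 0.10, 0.10])
groundtruth = np.array([0.12, 0.08, 0.28, 0.12, 0.22, 0.10, 0.08])

kl = np.sum(groundtruth * np.log(groundtruth / ours))
print(f"K-L distance: {kl:.4f}")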

Acknowledgement

This repository is developed based on FairytaleQA_QAG_System and FairytaleQA_Baseline.

Citation

@inproceedings{zhao2022storybookqag,
    author = {Zhao, Zhenjie and Hou, Yufang and Wang, Dakuo and Yu, Mo and Liu, Chengzhong and Ma, Xiaojuan},
    title = {Educational Question Generation of Children Storybooks via Question Type Distribution Learning and Event-Centric Summarization},
    publisher = {Association for Computational Linguistics},
    year = {2022}
}
