This repository contains code for the Merger Agreement Understanding Dataset (MAUD), a dataset for merger agreement review curated by the Atticus Project and used in the 2021 American Bar Association Public Target Deal Points Study.
First, install PyTorch with GPU support for your platform: https://pytorch.org/get-started/locally/
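To confirm that the PyTorch install can actually see a GPU before training, a quick check along these lines may help. This is a minimal sketch and not part of the repository: the function name describe_torch_device is hypothetical, and only the standard torch.cuda.is_available and torch.cuda.get_device_name calls are used.

```python
import importlib.util


def describe_torch_device() -> str:
    """Return a one-line summary of the device PyTorch would use."""
    # Avoid an ImportError if PyTorch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    if torch.cuda.is_available():
        # Index 0 is the first visible CUDA device.
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return f"PyTorch {torch.__version__} installed, but no CUDA device is visible"


if __name__ == "__main__":
    print(describe_torch_device())
```

If the output reports that no CUDA device is visible, the installed build is probably CPU-only or the GPU driver is not set up, and revisiting the install selector at the URL above is a reasonable next step.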
Then install this package in editable mode from the repository root: pip install -e .
Unzip the data files with unzip data.zip.
The best hyperparameters found, along with their corresponding validation scores, are available in the CSV files best_found_hps/*.csv.
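To browse the hyperparameter files without opening each one by hand, something like the following could be used. It is a sketch under assumptions: the helper load_hyperparameter_tables is hypothetical, the files are assumed to be ordinary comma-separated CSVs with a header row, and no particular column names are assumed, so the script only reports whatever columns it finds.

```python
import csv
import glob
import os


def load_hyperparameter_tables(directory="best_found_hps"):
    """Map each CSV filename in `directory` to its rows (a list of dicts)."""
    tables = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        with open(path, newline="", encoding="utf-8") as handle:
            # DictReader uses the first row as the column names.
            tables[os.path.basename(path)] = list(csv.DictReader(handle))
    return tables


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for name, rows in load_hyperparameter_tables().items():
        columns = list(rows[0].keys()) if rows else []
        print(f"{name}: {len(rows)} rows, columns = {columns}")
```

All values are read as strings, so numeric columns would need an explicit conversion before sorting or comparing scores.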
Run scripts/train.sh and scripts/train_multi.sh to train models with the best hyperparameters.
Run scripts/evaluate.sh to evaluate the models afterwards.
If you find MAUD useful in your research, please consider citing:
@misc{wang2023maud,
title={MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding},
author={Steven H. Wang and Antoine Scardigli and Leonard Tang and Wei Chen and Dimitry Levkin and Anya Chen and Spencer Ball and Thomas Woodside and Oliver Zhang and Dan Hendrycks},
year={2023},
eprint={2301.00876},
archivePrefix={arXiv},
primaryClass={cs.CL}
}