Evaluate the accuracy of replacements

Mokanarangan Thayaparan edited this page Oct 31, 2022 · 2 revisions

We evaluate the accuracy of the substitution using a downstream NLI model.

Given the original sentence and the replacement sentence, we evaluate whether the hypothesis entails the premise. We use the RoBERTa-MNLI model (from AllenNLP) for evaluation.

Since it is prohibitively expensive to run on the entire dataset, we select a subset of test count examples and calculate what percentage of them are predicted as entailment.

In addition to the average entailment percentage, we also return the average confidence of the prediction.
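
The sampling and aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `evaluate_replacements`, the `predict` callable and its `(label, confidence)` return shape, the `seed` argument, and the `toy_predict` stand-in are all hypothetical. It also assumes the original sentence is passed as the first argument to the NLI model and the replacement as the second, and that confidence is averaged over all sampled predictions. In practice `predict` would wrap the NLI model.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical predictor type: (original, replacement) -> (label, confidence).
Predictor = Callable[[str, str], Tuple[str, float]]


def evaluate_replacements(
    pairs: List[Tuple[str, str]],
    predict: Predictor,
    test_count: int,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (entailment percentage, average confidence) on a random subset."""
    if not pairs or test_count <= 0:
        raise ValueError("need at least one pair and a positive test_count")

    # Sample without replacement; cap at the dataset size.
    rng = random.Random(seed)
    sample = rng.sample(pairs, min(test_count, len(pairs)))

    results = [predict(original, replacement) for original, replacement in sample]

    entailed = sum(1 for label, _ in results if label == "entailment")
    entailment_pct = 100.0 * entailed / len(results)
    # Assumption: confidence is averaged over every prediction, not only entailments.
    avg_confidence = sum(conf for _, conf in results) / len(results)
    return entailment_pct, avg_confidence


def toy_predict(original: str, replacement: str) -> Tuple[str, float]:
    """Stand-in for a real NLI model, for illustration only."""
    if original.lower() == replacement.lower():
        return "entailment", 1.0
    return "neutral", 0.5
```

Fixing the seed keeps the sampled subset reproducible across runs, which makes the reported percentage comparable between experiments.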
