Evaluate the accuracy of replacements

We evaluate the accuracy of the replacements using a downstream natural language inference (NLI) model.

Given the original sentence and the replacement sentence, we treat the two as a premise-hypothesis pair and check whether the model predicts entailment. We use the RoBERTa-MNLI model from AllenNLP for this evaluation.

Since it is prohibitively expensive to run the model on the entire dataset, we select a subset whose size is given by the test count and calculate what fraction of those examples are predicted as entailment.

In addition to the entailment percentage, we also return the average confidence of the predictions.
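The procedure above can be sketched roughly as follows. This is a minimal illustration, not the code this project actually uses: the names `evaluate_replacements`, `predict_fn`, `test_count`, and `toy_predict` are hypothetical, the predictor is a stand-in for the real NLI model, and "average confidence" is assumed to mean the mean probability of the predicted label over the sampled examples.

```python
import random
from typing import Callable, Sequence, Tuple

Pair = Tuple[str, str]          # (original sentence, replacement sentence)
Prediction = Tuple[str, float]  # (predicted label, confidence of that label)


def evaluate_replacements(
    pairs: Sequence[Pair],
    predict_fn: Callable[[str, str], Prediction],
    test_count: int,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (entailment percentage, average confidence) on a random subset.

    All names here are hypothetical; predict_fn stands in for the NLI model.
    """
    if not pairs or test_count < 1:
        raise ValueError("need at least one pair and test_count >= 1")

    # Sample a subset instead of running the model on the whole dataset.
    rng = random.Random(seed)
    sample = rng.sample(list(pairs), min(test_count, len(pairs)))

    predictions = [predict_fn(original, replacement) for original, replacement in sample]

    # Percentage of sampled pairs that the model labels as entailment.
    entailed = sum(1 for label, _ in predictions if label == "entailment")
    entailment_pct = 100.0 * entailed / len(predictions)

    # Assumption: confidence is averaged over every sampled prediction.
    avg_confidence = sum(conf for _, conf in predictions) / len(predictions)
    return entailment_pct, avg_confidence


def toy_predict(original: str, replacement: str) -> Prediction:
    """Toy stand-in for an NLI model: case-insensitive match means entailment."""
    if original.lower() == replacement.lower():
        return "entailment", 0.9
    return "neutral", 0.6
```

In practice `predict_fn` would wrap the actual NLI model call and map its output to a label and a probability; fixing the seed keeps the sampled subset reproducible across runs.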
