Add Math Evaluation with Judge Model Evaluator #1077

liushz · 2024-04-24T05:24:59Z

An OpenCompass port of the OpenAI simple-evals evaluation for the MATH dataset.
The default judge model used in postprocessing is Llama-3-70B-Instruct.

| model | total |
| --- | --- |
| llama-3-8b-instruct-hf | 0.273 |
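
For readers unfamiliar with judge-based scoring, here is a minimal sketch of the general idea: a judge model is asked whether a predicted answer is equivalent to the reference answer, and the "yes" verdicts are averaged into a score. This is not the code from this PR. The prompt text and the names `JUDGE_PROMPT`, `is_equivalent`, `accuracy`, and `toy_judge` are hypothetical, and the judge is passed in as a plain callable so that no particular OpenCompass or model API is assumed.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical prompt; the actual template used by the PR may differ.
JUDGE_PROMPT = (
    "Decide whether the two math expressions are equivalent.\n"
    "Expression 1: {reference}\n"
    "Expression 2: {prediction}\n"
    "Answer with Yes or No only."
)


def is_equivalent(judge: Callable[[str], str], reference: str, prediction: str) -> bool:
    """Ask the judge whether the prediction matches the reference answer."""
    reply = judge(JUDGE_PROMPT.format(reference=reference, prediction=prediction))
    # Treat any reply that starts with "yes" (case-insensitive) as a match.
    return reply.strip().lower().startswith("yes")


def accuracy(judge: Callable[[str], str], pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (reference, prediction) pairs the judge accepts."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(is_equivalent(judge, ref, pred) for ref, pred in pairs)
    return correct / len(pairs)


def toy_judge(prompt: str) -> str:
    """Stand-in for a real judge model: compares the answers ignoring spaces."""
    lines = prompt.splitlines()
    ref = lines[1].split(": ", 1)[1]
    pred = lines[2].split(": ", 1)[1]
    return "Yes" if ref.replace(" ", "") == pred.replace(" ", "") else "No"


if __name__ == "__main__":
    samples = [("1/2", "1 / 2"), ("3", "4")]
    print(accuracy(toy_judge, samples))
```

In a real setup, `toy_judge` would be replaced by a call to the judge model (Llama-3-70B-Instruct by default, per the description above). Passing the judge as a callable keeps the scoring logic independent of how the model is served.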

liuhongwei added 3 commits April 24, 2024 13:08

Add Math Evaluation with Judge Model Evaluator

feec5eb

Add Math Evaluation with Judge Model Evaluator

096b0d2

Add Math Evaluation with Judge Model Evaluator

71b92d7

liushz requested a review from bittersweet1999 April 24, 2024 05:24

mm-assistant bot assigned liushz Apr 24, 2024

liushz temporarily deployed to prod April 24, 2024 05:25 — with GitHub Actions Inactive

liushz changed the title ~~Lhw add math judgement~~ Add Math Evaluation with Judge Model Evaluator Apr 24, 2024

Add Math Evaluation with Judge Model Evaluator

fdfcf7c

liushz temporarily deployed to prod April 24, 2024 07:22 — with GitHub Actions Inactive

bittersweet1999 closed this Apr 30, 2024

liushz deleted the lhw_add_math_judgement branch May 21, 2024 06:35
