regarding leaderboard submission #26

sorobedio · 2024-09-27T04:44:23Z

Hello, I have a set of pretrained models. I plan to evaluate them locally on the MMLU-Pro benchmark without any additional training, then select the best-performing model for submission. Is this approach valid, or could it be considered cheating?

Wyyyb · 2024-09-27T16:44:42Z

Our leaderboard is designed for evaluating single models. Manually selecting the best-performing model from a set of pre-trained models would be considered unfair. However, I think that approach is acceptable if you use techniques similar to Mixture of Experts (MoE) to automatically derive better results.
