
Help me understand the HELM Classic Leaderboard's missing results #2994

Open
PaulJoeMaliakel opened this issue Sep 16, 2024 · 1 comment

@PaulJoeMaliakel

[Screenshot from 2024-09-16 14-16-46]

Why were many models not evaluated on tasks like HellaSwag, OpenBookQA, MS MARCO, and summarization tasks like XSUM and CNN/Daily Mail? Is it because they are not suitable for these tasks?

@yifanmai (Collaborator)

Yes - many scenarios on HELM Classic use an adapter that scores the model's answers using log probabilities (logprobs). Many recent model APIs do not provide logprobs, so those models could not be evaluated on these scenarios.
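
For illustration, here is a minimal sketch of what such an adapter does. This is not HELM's actual code; the `logprob_of` helper and `pick_choice` function are hypothetical stand-ins for the idea of ranking answer choices by their logprobs:

```python
def logprob_of(model, prompt: str, continuation: str) -> float:
    """Hypothetical helper: total log probability that `model` assigns to
    `continuation` given `prompt`. It can only be implemented against an
    API that returns per-token logprobs; a text-only completion API cannot
    support it, which is why those models are skipped on these scenarios."""
    raise NotImplementedError

def pick_choice(model, question: str, choices: list[str]) -> int:
    """Score each answer choice separately by its logprob and return the
    index of the highest-scoring one, instead of asking the model to
    generate an answer letter as free text."""
    return max(
        range(len(choices)),
        key=lambda i: logprob_of(model, question, choices[i]),
    )
```

A multiple-choice scenario such as HellaSwag or OpenBookQA can be scored this way, so a model whose API never exposes logprobs simply has no result to report on the leaderboard.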
