Can you provide some examples of llama and gemma on common benchmarks? #1978
Comments
Hi @pass-lin, the links you provided aren't working. To reproduce these models with Keras, follow the steps below:
Please refer to this gist file, and to this issue link, which contains detailed information about reproducing the models on the GSM8K dataset. Also see LLaMA3-Quantization. Note that actual performance may vary with implementation details such as prompt formatting and answer extraction. Thank you.
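As a rough illustration of those steps, here is a minimal sketch of loading a Keras Hub causal LM and generating on a GSM8K-style prompt. The preset name and prompt format here are assumptions (check the Keras Hub docs for the presets available in your version); the exact prompt template and answer-extraction method materially affect the reported accuracy:

```python
import keras_hub

# Load a pretrained causal LM from Keras Hub.
# "llama3_8b_en" is an assumed preset name; substitute the one you are evaluating.
model = keras_hub.models.Llama3CausalLM.from_preset("llama3_8b_en")

# A GSM8K-style prompt ending with the test question. Published results
# typically use a few-shot template, which is omitted here for brevity.
prompt = (
    "Question: Natalia sold clips to 48 of her friends in April, and then "
    "she sold half as many clips in May. How many clips did Natalia sell "
    "altogether in April and May?\nAnswer:"
)

# Generate a completion; the numeric answer would then be extracted and
# compared against the GSM8K reference to compute accuracy.
output = model.generate(prompt, max_length=256)
print(output)
```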
Your code has some errors.
You will find the model cannot generate normally; the generated answer is identical to the input question.
Hi @pass-lin, try adjusting the temperature and top-k sampling parameters; this should help the model generate the expected output. Currently, the generated answer matches the input answer, not the question. Thank you.
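A minimal sketch of what that sampler adjustment might look like with the Keras Hub API. The `k` and `temperature` values below are illustrative starting points rather than tuned settings, and the preset name is again an assumption:

```python
import keras_hub

model = keras_hub.models.Llama3CausalLM.from_preset("llama3_8b_en")

# Swap the default sampler for top-k sampling with a lower temperature.
# Lower temperature sharpens the distribution; smaller k restricts
# sampling to the k most likely tokens at each step.
model.compile(
    sampler=keras_hub.samplers.TopKSampler(k=10, temperature=0.7),
)

output = model.generate("Question: ...\nAnswer:", max_length=256)
print(output)
```

For deterministic benchmark runs, greedy decoding (`keras_hub.samplers.GreedySampler()`) is another common choice, since sampling noise otherwise adds run-to-run variance to the measured accuracy.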
I am unable to reproduce the performance of the Llama 3 and Gemma 2 models implemented in Keras Hub on the GSM8K benchmark.
Paper refs: https://arxiv.org/pdf/2407.21783 and https://arxiv.org/pdf/2408.00118
Could the Keras team please provide an example of replicating the results, and also compare performance across the different backends?
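For context on the backend comparison: Keras 3 selects its backend through the `KERAS_BACKEND` environment variable, set before Keras is imported, so one way to compare is to run the same evaluation script once per backend. A rough sketch (the `gemma2_2b_en` preset name is an assumption; substitute whichever preset you are benchmarking):

```python
import os

# Must be set before importing keras / keras_hub.
# Run the script once with each of "jax", "tensorflow", "torch".
os.environ["KERAS_BACKEND"] = "jax"

import keras_hub  # noqa: E402

model = keras_hub.models.GemmaCausalLM.from_preset("gemma2_2b_en")
print(model.generate("Question: 2 + 2 = ?\nAnswer:", max_length=64))
```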