Add support for Llama3.1 #664
Comments
Hi David, FYI I'm able to load Llama-3.1-8B with optimum 0.0.23 and a manual upgrade to the latest transformers. No compilation is required; NEFFs are loaded from the cache.
However,
Environment:
I hope you can fix this soon :) Thanks!
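To make the version constraint discussed in this thread concrete, here is a minimal, stdlib-only sketch (the helper name is hypothetical; the 4.43.1 minimum is the one cited in this issue) that checks whether a given transformers version is new enough for Llama 3.1:

```python
# Minimal sketch (hypothetical helper): check whether a transformers version
# meets the transformers==4.43.1 minimum that Llama 3.1 needs, per this issue.
LLAMA_31_MIN = (4, 43, 1)

def supports_llama31(version: str) -> bool:
    """Return True if `version` (e.g. "4.41.1") is >= 4.43.1."""
    parts = tuple(int(p) for p in version.split("."))
    return parts >= LLAMA_31_MIN

print(supports_llama31("4.41.1"))  # False: the version pinned by optimum-neuron
print(supports_llama31("4.43.1"))  # True: the minimum Llama 3.1 requires
```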
When I tried 763104351884.dkr.ecr.ap-southeast-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.2-optimum0.0.23-neuronx-py310-ubuntu22.04 from AWS, which points to optimum-neuron 0.0.23, the deployment failed with ValueError:, which I believe is caused by the lower transformers version in this container.
Hello David, any progress on this? Appreciate it. I see 763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.2-optimum0.0.24-neuronx-py310-ubuntu22.04 still uses transformers 4.41.1.
@grhaonan I am working on it. You can track progress in my dev branch: https://github.com/huggingface/optimum-neuron/commits/bump_transformers/.
Hey @dacorvo, it seems like the branch was merged, but Llama 3.1 is still not supported. Are there any other action items that need to be done first?
+1 |
It is supported, but only if you build your own image for now. |
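If you do build your own image, a quick way to verify it picked up the expected dependencies is to print the installed versions inside the container. A hedged sketch (stdlib `importlib.metadata` only; package names as published on PyPI):

```python
# Sketch: report installed versions of the packages discussed in this thread,
# so a custom-built image can be checked for the expected transformers pin.
from importlib.metadata import version, PackageNotFoundError

def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

for pkg in ("transformers", "optimum-neuron"):
    print(f"{pkg}: {installed_version(pkg) or 'not installed'}")
```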
thanks @dacorvo |
Feature request
Llama 3.1 is out and should be compatible with Neuron; however, it requires `transformers==4.43.1`, and `optimum-neuron` has pinned `transformers` to `4.41.1`. Note that since `optimum` also pins the `transformers` version to a specific range, `optimum` must also be modified as a prerequisite (see huggingface/optimum#1968).
Motivation
Everybody wants the latest Llama.
Your contribution
Most of the changes are likely to be related to training, but I will be happy to review.