
Model Export & Inference #502

karpathy opened this issue May 30, 2024 · 3 comments

@karpathy
Owner

I'd be very interested in how we could take llm.c models and export them into universal formats, e.g. for very fast inference in llama.cpp, vllm, etc., or how they could be made HuggingFace compatible. This would also allow us to run more comprehensive evals on the models we train in llm.c, because it would (hopefully) slot into the existing infrastructure of those projects.

@YuchenJin
Contributor

Most inference frameworks, including vllm and llama.cpp, support the safetensors format.

In theory, we can write a utility Python script that would:

  1. Load the binary checkpoint file generated by llm.c;
  2. Organize the weights into a dictionary and convert it to a PyTorch model state_dict;
  3. Save the weights using safetensors:

     from safetensors.torch import save_file
     save_file(state_dict, "model.safetensors")
This would also let us use libraries such as lighteval to perform broad evaluations across more benchmarks.
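Steps 1 and 2 above could be sketched roughly as follows. This is a hypothetical illustration, not llm.c's actual layout: it assumes a 256-int32 header followed by flat float32 tensors, and the tensor names, shapes, and header field positions (`V`, `maxT`, `C`) are made up for the example — the real script would have to mirror the exact order in which llm.c writes its parameters.

```python
# Hypothetical sketch of reading an llm.c-style binary checkpoint
# into a dict of named numpy arrays (step 1 and 2 of the plan above).
import struct

import numpy as np


def read_header(f):
    """Read a 256-int32 little-endian header and return it as a tuple."""
    return struct.unpack("<256i", f.read(256 * 4))


def params_spec(V, maxT, C):
    """Illustrative (name, shape) pairs for the first few GPT-2 tensors.

    A real converter would continue with the per-layer attention and MLP
    weights in exactly the order the checkpoint stores them.
    """
    return [
        ("wte.weight", (V, C)),     # token embedding table
        ("wpe.weight", (maxT, C)),  # position embedding table
    ]


def load_tensors(f, spec):
    """Read flat float32 tensors in checkpoint order and reshape them."""
    tensors = {}
    for name, shape in spec:
        count = int(np.prod(shape))
        raw = f.read(count * 4)
        tensors[name] = np.frombuffer(raw, dtype=np.float32).reshape(shape)
    return tensors
```

The resulting dict could then be written out for step 3, e.g. with `safetensors.numpy.save_file(tensors, "model.safetensors")`, or converted to torch tensors first and saved via `safetensors.torch.save_file`.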

@karpathy
Owner Author

karpathy commented Jun 4, 2024

@YuchenJin yep, exactly what I had in mind! I put up the issue because I am sequencing other things before I get around to it; possibly someone can pick it up in parallel in the meantime.

@YuchenJin
Contributor

Cool, I will give it a shot if no one has started working on it by the middle of next week. :)
