What does this PR do?
Export Hugging Face models to ggml format using a dedicated exporter. So far I have added the `bloom` and `starcoder` architectures, for PyTorch only and without quantization. In the current PR, adding a new architecture would look as follows:

I am already a bit overwhelmed by the diversity between just `starcoder` and `bloom`. This makes me question whether a solution like this would be too complex to support a wide range of models, so I am opening it as a draft PR to gather feedback before polishing any further.

To try it yourself, see the test cases in `tests/exporters/ggml/test_ggml_export.py` and make sure the ggml.cpp submodule is loaded correctly (I think you need to explicitly pull it for the source code to appear).
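A minimal sketch of that setup step, assuming a standard git submodule checkout and pytest as the test runner. The helper names and the `ggml` directory used in the usage note are hypothetical; only the test file path comes from this PR.

```python
from pathlib import Path
from typing import List

# Test file path as given in this PR description.
TEST_FILE = "tests/exporters/ggml/test_ggml_export.py"


def submodule_populated(path: Path) -> bool:
    """Return True if the directory exists and contains at least one entry."""
    return path.is_dir() and any(path.iterdir())


def commands_to_run(submodule_dir: Path) -> List[List[str]]:
    """Build the commands to run: pull submodules if the directory is empty, then run the tests."""
    cmds: List[List[str]] = []
    if not submodule_populated(submodule_dir):
        # Standard git command to fetch submodule sources after a plain clone.
        cmds.append(["git", "submodule", "update", "--init", "--recursive"])
    cmds.append(["python", "-m", "pytest", TEST_FILE])
    return cmds
```

For example, `commands_to_run(Path("ggml"))` (with `ggml` standing in for wherever the submodule actually lives) returns the commands in order, and each can be passed to `subprocess.run(cmd, check=True)` from the repository root.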
Related issue: #903 (cc @fxmarty @NouamaneTazi @sidistic)
Before submitting