You are calling save_pretrained to a 4-bit converted model, but your bitsandbytes version doesn't support it. #3951

Open
shripadk opened this issue Mar 3, 2024 · 4 comments
Labels: llm (Large Language Model related)

shripadk commented Mar 3, 2024

Describe the bug

I have enabled 4-bit quantization for fine-tuning mistralai/Mistral-7B-v0.1. Ludwig 0.10.1 appears to pin bitsandbytes < 0.41.0, and when I run the trainer I get the following warning:

```
You are calling `save_pretrained` to a 4-bit converted model, but your `bitsandbytes` version doesn't support it.
If you want to save 4-bit models, make sure to have `bitsandbytes>=0.41.3` installed.
```
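
This warning is emitted by Hugging Face transformers, which requires bitsandbytes >= 0.41.3 to serialize 4-bit weights, so a bitsandbytes pinned below 0.41.0 can never satisfy it. For reference, a quick diagnostic sketch to confirm which versions the environment actually resolved:

```python
# Diagnostic sketch: print the installed versions to confirm the conflict
# between Ludwig's bitsandbytes pin (< 0.41.0) and the >= 0.41.3 that
# transformers needs to save 4-bit checkpoints.
import importlib.metadata

for pkg in ("ludwig", "transformers", "bitsandbytes"):
    try:
        print(f"{pkg}=={importlib.metadata.version(pkg)}")
    except importlib.metadata.PackageNotFoundError:
        print(f"{pkg} is not installed")
```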

To Reproduce
Steps to reproduce the behavior:

1. Install Ludwig:

```
pip install ludwig[full]
```

2. Create the config file (model.yaml):

```yaml
model_type: llm
base_model: mistralai/Mistral-7B-v0.1

quantization:
  bits: 4

adapter:
  type: lora

prompt:
  template: |
    ### Instruction:
    {instruction}

    ### Input:
    {input}

    ### Response:

input_features:
  - name: prompt
    type: text

output_features:
  - name: output
    type: text

generation:
  temperature: 0.1

trainer:
  type: finetune
  epochs: 3
  optimizer:
    type: paged_adam
  batch_size: 1
  eval_steps: 100
  learning_rate: 0.0002
  eval_batch_size: 2
  steps_per_checkpoint: 1000
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.03
  gradient_accumulation_steps: 16
  enable_gradient_checkpointing: true

preprocessing:
  sample_ratio: 0.1
```

3. Train the model:

```
ludwig train --config model.yaml --dataset "ludwig://alpaca"
```
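
The same run can also be driven from the Ludwig Python API. A minimal sketch, assuming the model.yaml from step 2 is in the working directory:

```python
# Sketch of the programmatic equivalent of `ludwig train`.
# Assumes model.yaml (step 2 above) is in the current directory.
from ludwig.api import LudwigModel

model = LudwigModel(config="model.yaml")
results = model.train(dataset="ludwig://alpaca")
```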

Expected behavior
The trainer should not warn that the installed bitsandbytes version does not support save_pretrained for 4-bit quantized models.

Environment (please complete the following information):

  • OS: Linux
  • Version: 6.7.6-arch1-1
  • Python: 3.10.8
  • Ludwig: v0.10.1

@alexsherstinsky

yogeshhk commented Mar 4, 2024

Here is the notebook showing the run. The first run asked for a RESTART; after restarting and running all the cells, the output is here: https://colab.research.google.com/drive/1kmZhQKBzpHBJRJvvp9PEdPEUMfMu6dh7?usp=sharing. Just FYI: the output of the model is "","", but that's most likely an issue with the base model! @shripadk @alexsherstinsky

alexsherstinsky (Collaborator) commented

@shripadk Are you still having this issue? A new version of Ludwig will be released next week, so you may wish to try again. Please keep an eye on the release announcement in our Discord. Thank you!

alexsherstinsky added the llm (Large Language Model related) label on Jul 26, 2024
shripadk (Author) commented

@alexsherstinsky Thanks for the heads-up. I'll definitely take a look at it and get back to you on this, and I'll keep an eye out for the release announcement. Thanks again 🎉
