
Document (1) how to use local models, (2) which model classes are supported by prompt2model #356

Closed
chensimian opened this issue Sep 20, 2023 · 12 comments
Labels
documentation Improvements or additions to documentation

Comments

@chensimian

How can I use a model that has already been downloaded locally, instead of downloading the model from Hugging Face?

@neubig (Collaborator) commented Sep 20, 2023

Thanks for the interest @chensimian , can I clarify your request? Basically, prompt2model identifies the most useful model to fine-tune on the Hugging Face hub, but instead of using a model from Hugging Face you'd like to use one on your local disk?

I think that you can probably do this by replacing the name of the pre-trained model that is passed into GenerationModelTrainer with the path to the local model (/path/to/my/model below):

trainer = GenerationModelTrainer(
    "/path/to/my/model",
    has_encoder=True,
    executor_batch_size=batch_size,
    tokenizer_max_length=1024,
    sequence_max_length=1280,
)

Please tell us if this works, or doesn't work.
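As a quick sanity check (a minimal sketch, untested; /path/to/my/model is a placeholder), you can also confirm that transformers resolves the path as a local directory rather than as a Hub repo ID:

from transformers import AutoConfig

# If this loads without contacting the Hub, the same path should work
# when passed to GenerationModelTrainer.
config = AutoConfig.from_pretrained("/path/to/my/model")
print(type(config).__name__)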

@chensimian (Author)

Which models are supported, such as Llama or Baichuan?

@chensimian (Author) commented Sep 21, 2023

Where is the downloaded Hugging Face model stored in the directory?

@chensimian (Author)

(quoting @neubig's suggestion above to pass the local model path to GenerationModelTrainer)

I tried the modification you suggested, but it didn't work; it still tries to fetch the model from the Hugging Face website. For example, the following error occurs:
ValueError: Loading this model requires you to execute some code in that repo on your local machine. Make sure you have read the code at https://hf.co//mnt/afs/data/model/open_source_data/chatglm2-6b to avoid malicious use, then set the option trust_remote_code=True to remove this error.

@neubig (Collaborator) commented Sep 21, 2023

Hi @chensimian, thanks for clarifying! Let me respond to the questions. I think both of the first two questions should be documented, so I'd like to leave this issue open until we document them.

Which models are supported, such as Llama or Baichuan?

We support all models that are supported by Hugging Face's AutoModelForCausalLM and AutoModelForSeq2SeqLM, so Llama and Baichuan should be supported. However, by default the prompt2model model retriever only retrieves models that are less than 3GB on disk, so many of these models will be excluded as being too big.

Where is the downloaded Hugging Face model stored in the directory?

This is stored in your standard Hugging Face cache directory (by default ~/.cache/huggingface, overridable with the HF_HOME environment variable).
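A minimal sketch for finding the cache location programmatically (assuming a transformers version that exports TRANSFORMERS_CACHE from transformers.utils):

import os
from transformers.utils import TRANSFORMERS_CACHE

# Defaults to ~/.cache/huggingface/hub unless HF_HOME or TRANSFORMERS_CACHE
# is set in the environment.
print(TRANSFORMERS_CACHE)
print(os.environ.get("HF_HOME", "HF_HOME not set"))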

the following error occurs:
ValueError: Loading this model requires you to execute some code in that repo on your local machine. Make sure you have read the code at https://hf.co//mnt/afs/data/model/open_source_data/chatglm2-6b to avoid malicious use, then set the option trust_remote_code=True to remove this error.

This is a different error, related to models that require executing code that is not vetted by Hugging Face (so you need to give special permission). I will create a separate issue for supporting this. We would be happy to accept a PR if you can fix this!
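In the meantime, a possible workaround outside of prompt2model (a sketch, assuming you have read and trust the custom code that ships with the checkpoint) is to load the model directly with transformers and pass trust_remote_code=True:

from transformers import AutoModel, AutoTokenizer

path = "/mnt/afs/data/model/open_source_data/chatglm2-6b"  # your local path
# Only set trust_remote_code=True after reviewing the repo's custom code.
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModel.from_pretrained(path, trust_remote_code=True)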

@neubig neubig added the documentation Improvements or additions to documentation label Sep 21, 2023
@neubig neubig changed the title change model Document how to (1) use local models, (2) which model classes are supported by prompt2model Sep 21, 2023
@neubig neubig changed the title Document how to (1) use local models, (2) which model classes are supported by prompt2model Document (1) how to use local models, (2) which model classes are supported by prompt2model Sep 21, 2023
@chensimian (Author)

(quoting @neubig's reply above)

You say the prompt2model model retriever only retrieves models that are less than 3GB on disk, so many models will be excluded as too big. But I want to use a model that is larger than 3GB. What should I do?

@neubig (Collaborator) commented Sep 21, 2023

If you're downloading and loading the model locally (as you asked in the beginning of this thread), then this is not a problem. Prompt2model will happily train that model for you.

If you want to use the model retriever, we will need to fix issue #273 first to make the maximum model size configurable.

@chensimian (Author)

(quoting @neubig's reply above)

Do you mean that the issue you mentioned is still unresolved? For example, when I try to train using the chatgpt 6b model, I encounter this error: ValueError: Expected input batch_size (284) to match target batch_size (51).

@viswavi (Collaborator) commented Sep 21, 2023

Hi @chensimian, I haven't tried with this model (and I think you mean chatglm-6b?). Can you please share the full stack trace for that ValueError to help us debug? Thank you.

@chensimian (Author)

(quoting @viswavi's request for a stack trace above)
When I switch to training with a local small model, such as flan-t5-base, the following error occurs:
FileNotFoundError: Couldn't find a module script at /prompt2model-main/chrf/chrf.py. Module 'chrf' doesn't exist on the Hugging Face Hub either.
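
One possible workaround (a sketch, not confirmed in this thread): run a one-off script while online so that the chrf metric script is downloaded into the local evaluate cache, after which offline runs can find it:

import evaluate

# Downloads and caches the chrf metric script from the Hugging Face Hub.
chrf = evaluate.load("chrf")
print(chrf.compute(predictions=["hello there"], references=[["hello there"]]))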

@chensimian (Author)

(quoting, in translation, @neubig's earlier answer: prompt2model supports all models supported by Hugging Face's AutoModelForCausalLM and AutoModelForSeq2SeqLM, so Llama and Baichuan should be supported, but by default the model retriever only retrieves models smaller than 3GB on disk.)

@chensimian (Author)

(quoting @neubig's reply above)

I don't use the model retriever to retrieve models. For example, if I train a Baichuan model directly from local disk, the following error occurs: Model type should be one of BartConfig, BigBirdPegasusConfig, BlenderbotConfig, BlenderbotSmallConfig, EncoderDecoderConfig, FSMTConfig, GPTSanJapaneseConfig, LEDConfig, LongT5Config, M2M100Config, MarianConfig, MBartConfig, MT5Config, MvpConfig, NllbMoeConfig, PegasusConfig, PegasusXConfig, PLBartConfig, ProphetNetConfig, SwitchTransformersConfig, T5Config, UMT5Config, XLMProphetNetConfig.
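
A possible cause (not confirmed in the thread): the config list in the error is exactly the set supported by AutoModelForSeq2SeqLM, which suggests the trainer was constructed with has_encoder=True, while Baichuan is a decoder-only model. A minimal sketch of the decoder-only path, with a hypothetical local path:

trainer = GenerationModelTrainer(
    "/path/to/baichuan",  # hypothetical local path to the Baichuan checkpoint
    has_encoder=False,    # decoder-only models load via AutoModelForCausalLM
    tokenizer_max_length=1024,
    sequence_max_length=1280,
)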
