
Document (1) how to use local models, (2) which model classes are supported by prompt2model #356

Closed
chensimian opened this issue Sep 20, 2023 · 12 comments
Labels
documentation Improvements or additions to documentation

Comments

@chensimian

How can I use a model that has already been downloaded locally, instead of downloading the model from Hugging Face?

@neubig (Collaborator) commented Sep 20, 2023

Thanks for the interest @chensimian , can I clarify your request? Basically, prompt2model identifies the most useful model to fine-tune on the Hugging Face hub, but instead of using a model from Hugging Face you'd like to use one on your local disk?

I think that you can probably do this by replacing the name of the pre-trained model that is passed into GenerationModelTrainer with the path to the local model (/path/to/my/model below):

trainer = GenerationModelTrainer(
    "/path/to/my/model",
    has_encoder=True,
    executor_batch_size=batch_size,
    tokenizer_max_length=1024,
    sequence_max_length=1280,
)

Please tell us if this works, or doesn't work.
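As a quick sanity check (a minimal sketch, untested; /path/to/my/model is a placeholder), you can also confirm that transformers resolves the path as a local directory rather than as a Hub repo ID:

from transformers import AutoConfig

# If this loads without contacting the Hub, the same path should work
# when passed to GenerationModelTrainer.
config = AutoConfig.from_pretrained("/path/to/my/model")
print(type(config).__name__)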

@chensimian (Author)

Which models are supported, such as Llama or Baichuan?

@chensimian (Author) commented Sep 21, 2023

Where is the downloaded Hugging Face model stored in the directory?

@chensimian (Author)

(quoting @neubig's suggestion above to pass the local model path to GenerationModelTrainer)

I tried the modification you suggested, but it didn't work; it still tries to fetch the model from the Hugging Face website. For example, the following error occurs:
ValueError: Loading this model requires you to execute some code in that repo on your local machine. Make sure you have read the code at https://hf.co//mnt/afs/data/model/open_source_data/chatglm2-6b to avoid malicious use, then set the option trust_remote_code=True to remove this error.

@neubig (Collaborator) commented Sep 21, 2023

Hi @chensimian, thanks for clarifying! Let me respond to the questions. I think both of the first two questions should be documented, so I'd like to leave this issue open until we document them.

Which models are supported, such as Llama or Baichuan?

We support all models that are supported by Hugging Face's AutoModelForCausalLM and AutoModelForSeq2SeqLM, so Llama and Baichuan should be supported. However, by default the prompt2model model retriever only retrieves models that are less than 3GB on disk, so many of these models will be excluded as being too big.

Where is the downloaded Hugging Face model stored in the directory?

This is stored in your standard Hugging Face cache directory (by default ~/.cache/huggingface, overridable with the HF_HOME environment variable).
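A minimal sketch for finding the cache location programmatically (assuming a transformers version that exports TRANSFORMERS_CACHE from transformers.utils):

import os
from transformers.utils import TRANSFORMERS_CACHE

# Defaults to ~/.cache/huggingface/hub unless HF_HOME or TRANSFORMERS_CACHE
# is set in the environment.
print(TRANSFORMERS_CACHE)
print(os.environ.get("HF_HOME", "HF_HOME not set"))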

the following error occurs:
ValueError: Loading this model requires you to execute some code in that repo on your local machine. Make sure you have read the code at https://hf.co//mnt/afs/data/model/open_source_data/chatglm2-6b to avoid malicious use, then set the option trust_remote_code=True to remove this error.

This is a different error, related to models that require executing code that is not vetted by Hugging Face (so you need to give special permission). I will create a separate issue for supporting this. We would be happy to accept a PR if you can fix this!
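In the meantime, a possible workaround outside of prompt2model (a sketch, assuming you have read and trust the custom code that ships with the checkpoint) is to load the model directly with transformers and pass trust_remote_code=True:

from transformers import AutoModel, AutoTokenizer

path = "/mnt/afs/data/model/open_source_data/chatglm2-6b"  # your local path
# Only set trust_remote_code=True after reviewing the repo's custom code.
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModel.from_pretrained(path, trust_remote_code=True)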

@neubig neubig added the documentation Improvements or additions to documentation label Sep 21, 2023
@neubig neubig changed the title change model Document how to (1) use local models, (2) which model classes are supported by prompt2model Sep 21, 2023
@neubig neubig changed the title Document how to (1) use local models, (2) which model classes are supported by prompt2model Document (1) how to use local models, (2) which model classes are supported by prompt2model Sep 21, 2023
@chensimian (Author)

(quoting @neubig's reply above)

You say the prompt2model model retriever only retrieves models that are less than 3GB on disk, so many models will be excluded as too big. But I want to use a model that is larger than 3GB. What should I do?

@neubig (Collaborator) commented Sep 21, 2023

If you're downloading and loading the model locally (as you asked in the beginning of this thread), then this is not a problem. Prompt2model will happily train that model for you.

If you want to use the model retriever, we will need to fix issue #273 first to make the maximum model size configurable.

@chensimian (Author)

(quoting @neubig's reply above)

Do you mean that the issue you mentioned is still unresolved? For example, when I try to train using the chatgpt 6b model, I encounter this error: ValueError: Expected input batch_size (284) to match target batch_size (51).

@viswavi (Collaborator) commented Sep 21, 2023

Hi @chensimian, I haven't tried with this model (and I think you mean chatglm-6b?). Can you please share the full stack trace for that ValueError to help us debug? Thank you.

@chensimian (Author)

(quoting @viswavi's request for a stack trace above)
When I switch to training with a local small model, such as flan-t5-base, the following error occurs:
FileNotFoundError: Couldn't find a module script at /prompt2model-main/chrf/chrf.py. Module 'chrf' doesn't exist on the Hugging Face Hub either.
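
One possible workaround (a sketch, not confirmed in this thread): run a one-off script while online so that the chrf metric script is downloaded into the local evaluate cache, after which offline runs can find it:

import evaluate

# Downloads and caches the chrf metric script from the Hugging Face Hub.
chrf = evaluate.load("chrf")
print(chrf.compute(predictions=["hello there"], references=[["hello there"]]))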

@chensimian (Author)

(quoting, in translation, @neubig's earlier answer: prompt2model supports all models supported by Hugging Face's AutoModelForCausalLM and AutoModelForSeq2SeqLM, so Llama and Baichuan should be supported, but by default the model retriever only retrieves models smaller than 3GB on disk.)

@chensimian (Author)

(quoting @neubig's reply above)

I don't use the model retriever to retrieve models. For example, if I train a Baichuan model directly from local disk, the following error occurs: Model type should be one of BartConfig, BigBirdPegasusConfig, BlenderbotConfig, BlenderbotSmallConfig, EncoderDecoderConfig, FSMTConfig, GPTSanJapaneseConfig, LEDConfig, LongT5Config, M2M100Config, MarianConfig, MBartConfig, MT5Config, MvpConfig, NllbMoeConfig, PegasusConfig, PegasusXConfig, PLBartConfig, ProphetNetConfig, SwitchTransformersConfig, T5Config, UMT5Config, XLMProphetNetConfig.
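
A possible cause (not confirmed in the thread): the config list in the error is exactly the set supported by AutoModelForSeq2SeqLM, which suggests the trainer was constructed with has_encoder=True, while Baichuan is a decoder-only model. A minimal sketch of the decoder-only path, with a hypothetical local path:

trainer = GenerationModelTrainer(
    "/path/to/baichuan",  # hypothetical local path to the Baichuan checkpoint
    has_encoder=False,    # decoder-only models load via AutoModelForCausalLM
    tokenizer_max_length=1024,
    sequence_max_length=1280,
)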
