config class for bert is not consistent #438

Open
DanielWit opened this issue Nov 25, 2023 · 2 comments

@DanielWit

Hey, I am trying to pull the model from the Hugging Face repo using
AutoModelForMaskedLM.from_pretrained('mosaicml/mosaic-bert-base-seqlen-2048', trust_remote_code=True, revision='b7a0389')
(with and without the revision param). Either way I get the same error:
ValueError: The model class you are passing has a config_class attribute that is not consistent with the config class you passed (model has <class 'transformers.models.bert.configuration_bert.BertConfig'> and you passed <class 'transformers_modules.mosaicml.mosaic-bert-base-seqlen-2048.b7a0389deadf7a7261a3e5e7ea0680d8ba12232f.configuration_bert.BertConfig'>. Fix one of those so they match!
Do you have any suggestions as to why this might be the case?

When I do this: BertModel.from_pretrained('mosaicml/mosaic-bert-base-seqlen-2048'), it seems to work correctly, although I am not sure whether flash attention will work, given this statement in the model card: "This model requires that trust_remote_code=True be passed to the from_pretrained method. This is because we train using FlashAttention (Dao et al. 2022), which is not part of the transformers library and depends on Triton and some custom PyTorch code." The BertModel class doesn't have a trust_remote_code parameter.

@dakinggg
Collaborator

Looks like Hugging Face added some stricter checking at some point. If you go back to the transformers version this model was trained on (4.25.1), the auto class should work as expected. Otherwise you can load with BertModel as you've done, and it should work (assuming you imported BertModel from this repo). I'll also try to get this fixed to work with later transformers versions.
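
For anyone who wants to try the version pin, a minimal sketch (untested here; it just restates the suggestion above):

# Assumes transformers is pinned to the training version first, e.g.:
#   pip install transformers==4.25.1
from transformers import AutoModelForMaskedLM

# With the older transformers, the stricter config-class check is not
# triggered, so the auto class can load the remote code directly.
model = AutoModelForMaskedLM.from_pretrained(
    'mosaicml/mosaic-bert-base-seqlen-2048',
    trust_remote_code=True,
)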

@jacobfulano
Contributor

A quick fix here is to get the config first and then pass it in to AutoModelForMaskedLM.from_pretrained:

import transformers
from transformers import AutoModelForMaskedLM, BertTokenizer, pipeline

# MosaicBERT uses the standard BERT tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# The config needs to be loaded explicitly and passed in
config = transformers.BertConfig.from_pretrained('mosaicml/mosaic-bert-base-seqlen-2048')
mosaicbert = AutoModelForMaskedLM.from_pretrained(
    'mosaicml/mosaic-bert-base-seqlen-2048',
    config=config,
    trust_remote_code=True,
)

# To use this model directly for masked language modeling
mosaicbert_classifier = pipeline('fill-mask', model=mosaicbert, tokenizer=tokenizer, device='cpu')
mosaicbert_classifier("I [MASK] to the store yesterday.")
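
If I'm reading the check correctly, this works because the config loaded this way is the stock transformers.BertConfig, which matches the config_class attribute declared on the model class, so the consistency check in from_pretrained passes.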

We're updating the documentation in https://huggingface.co/mosaicml/mosaic-bert-base
