config class for bert is not consistent #438

Open
DanielWit opened this issue Nov 25, 2023 · 2 comments

@DanielWit

Hey, I am trying to pull the model from the Hugging Face repo using
AutoModelForMaskedLM.from_pretrained('mosaicml/mosaic-bert-base-seqlen-2048', trust_remote_code=True, revision='b7a0389')
(with and without the revision param). Either way I get the same error:
ValueError: The model class you are passing has a config_class attribute that is not consistent with the config class you passed (model has <class 'transformers.models.bert.configuration_bert.BertConfig'> and you passed <class 'transformers_modules.mosaicml.mosaic-bert-base-seqlen-2048.b7a0389deadf7a7261a3e5e7ea0680d8ba12232f.configuration_bert.BertConfig'>. Fix one of those so they match!
Do you have any suggestions as to why this might be the case?

When I do this: BertModel.from_pretrained('mosaicml/mosaic-bert-base-seqlen-2048'), it seems to work correctly, although I am not sure whether flash attention will work, given this statement in the model card: "This model requires that trust_remote_code=True be passed to the from_pretrained method. This is because we train using FlashAttention (Dao et al. 2022), which is not part of the transformers library and depends on Triton and some custom PyTorch code." The BertModel class doesn't have a trust_remote_code parameter.

@dakinggg
Collaborator

Looks like Hugging Face added some stricter checking at some point. If you go back to the transformers version this model was trained on (4.25.1), the auto class should work as expected. Otherwise you can load with BertModel as you've done, and it should work (assuming you imported BertModel from this repo). I'll also try to get this fixed to work with later transformers versions.
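
For anyone who wants to try the version pin, a minimal sketch (untested here; it just restates the suggestion above):

# Assumes transformers is pinned to the training version first, e.g.:
#   pip install transformers==4.25.1
from transformers import AutoModelForMaskedLM

# With the older transformers, the stricter config-class check is not
# triggered, so the auto class can load the remote code directly.
model = AutoModelForMaskedLM.from_pretrained(
    'mosaicml/mosaic-bert-base-seqlen-2048',
    trust_remote_code=True,
)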

@jacobfulano
Contributor

A quick fix here is to get the config first and then pass it in to AutoModelForMaskedLM.from_pretrained:

import transformers
from transformers import AutoModelForMaskedLM, BertTokenizer, pipeline

# MosaicBERT uses the standard BERT tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# The config needs to be loaded explicitly and passed in
config = transformers.BertConfig.from_pretrained('mosaicml/mosaic-bert-base-seqlen-2048')
mosaicbert = AutoModelForMaskedLM.from_pretrained(
    'mosaicml/mosaic-bert-base-seqlen-2048',
    config=config,
    trust_remote_code=True,
)

# To use this model directly for masked language modeling
mosaicbert_classifier = pipeline('fill-mask', model=mosaicbert, tokenizer=tokenizer, device='cpu')
mosaicbert_classifier("I [MASK] to the store yesterday.")
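
If I'm reading the check correctly, this works because the config loaded this way is the stock transformers.BertConfig, which matches the config_class attribute declared on the model class, so the consistency check in from_pretrained passes.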

We're updating the documentation in https://huggingface.co/mosaicml/mosaic-bert-base
