
Finetuning Error #454

Open
naimavahab opened this issue Jul 29, 2024 · 0 comments

@naimavahab
After loading my checkpoint into a sequence-classification model, I get this error:

```
Missing key(s) in state_dict: "model.bert.pooler.dense.weight", "model.bert.pooler.dense.bias", "model.classifier.weight", "model.classifier.bias".
Unexpected key(s) in state_dict: "model.cls.predictions.decoder.bias", "model.cls.predictions.decoder.weight", "model.cls.predictions.transform.LayerNorm.bias", "model.cls.predictions.transform.LayerNorm.weight", "model.cls.predictions.transform.dense.bias", "model.cls.predictions.transform.dense.weight"
```

My checkpoint's state_dict keys are:
['model.bert.embeddings.LayerNorm.bias', 'model.bert.embeddings.LayerNorm.weight', 'model.bert.embeddings.token_type_embeddings.weight', 'model.bert.embeddings.word_embeddings.weight', 'model.bert.encoder.layer.0.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.0.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.0.attention.output.dense.bias', 'model.bert.encoder.layer.0.attention.output.dense.weight', 'model.bert.encoder.layer.0.attention.self.Wqkv.bias', 'model.bert.encoder.layer.0.attention.self.Wqkv.weight', 'model.bert.encoder.layer.0.mlp.gated_layers.weight', 'model.bert.encoder.layer.0.mlp.layernorm.bias', 'model.bert.encoder.layer.0.mlp.layernorm.weight', 'model.bert.encoder.layer.0.mlp.wo.bias', 'model.bert.encoder.layer.0.mlp.wo.weight', 'model.bert.encoder.layer.1.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.1.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.1.attention.output.dense.bias', 'model.bert.encoder.layer.1.attention.output.dense.weight', 'model.bert.encoder.layer.1.attention.self.Wqkv.bias', 'model.bert.encoder.layer.1.attention.self.Wqkv.weight', 'model.bert.encoder.layer.1.mlp.gated_layers.weight', 'model.bert.encoder.layer.1.mlp.layernorm.bias', 'model.bert.encoder.layer.1.mlp.layernorm.weight', 'model.bert.encoder.layer.1.mlp.wo.bias', 'model.bert.encoder.layer.1.mlp.wo.weight', 'model.bert.encoder.layer.10.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.10.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.10.attention.output.dense.bias', 'model.bert.encoder.layer.10.attention.output.dense.weight', 'model.bert.encoder.layer.10.attention.self.Wqkv.bias', 'model.bert.encoder.layer.10.attention.self.Wqkv.weight', 'model.bert.encoder.layer.10.mlp.gated_layers.weight', 'model.bert.encoder.layer.10.mlp.layernorm.bias', 'model.bert.encoder.layer.10.mlp.layernorm.weight', 'model.bert.encoder.layer.10.mlp.wo.bias', 'model.bert.encoder.layer.10.mlp.wo.weight', 
'model.bert.encoder.layer.11.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.11.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.11.attention.output.dense.bias', 'model.bert.encoder.layer.11.attention.output.dense.weight', 'model.bert.encoder.layer.11.attention.self.Wqkv.bias', 'model.bert.encoder.layer.11.attention.self.Wqkv.weight', 'model.bert.encoder.layer.11.mlp.gated_layers.weight', 'model.bert.encoder.layer.11.mlp.layernorm.bias', 'model.bert.encoder.layer.11.mlp.layernorm.weight', 'model.bert.encoder.layer.11.mlp.wo.bias', 'model.bert.encoder.layer.11.mlp.wo.weight', 'model.bert.encoder.layer.2.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.2.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.2.attention.output.dense.bias', 'model.bert.encoder.layer.2.attention.output.dense.weight', 'model.bert.encoder.layer.2.attention.self.Wqkv.bias', 'model.bert.encoder.layer.2.attention.self.Wqkv.weight', 'model.bert.encoder.layer.2.mlp.gated_layers.weight', 'model.bert.encoder.layer.2.mlp.layernorm.bias', 'model.bert.encoder.layer.2.mlp.layernorm.weight', 'model.bert.encoder.layer.2.mlp.wo.bias', 'model.bert.encoder.layer.2.mlp.wo.weight', 'model.bert.encoder.layer.3.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.3.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.3.attention.output.dense.bias', 'model.bert.encoder.layer.3.attention.output.dense.weight', 'model.bert.encoder.layer.3.attention.self.Wqkv.bias', 'model.bert.encoder.layer.3.attention.self.Wqkv.weight', 'model.bert.encoder.layer.3.mlp.gated_layers.weight', 'model.bert.encoder.layer.3.mlp.layernorm.bias', 'model.bert.encoder.layer.3.mlp.layernorm.weight', 'model.bert.encoder.layer.3.mlp.wo.bias', 'model.bert.encoder.layer.3.mlp.wo.weight', 'model.bert.encoder.layer.4.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.4.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.4.attention.output.dense.bias', 
'model.bert.encoder.layer.4.attention.output.dense.weight', 'model.bert.encoder.layer.4.attention.self.Wqkv.bias', 'model.bert.encoder.layer.4.attention.self.Wqkv.weight', 'model.bert.encoder.layer.4.mlp.gated_layers.weight', 'model.bert.encoder.layer.4.mlp.layernorm.bias', 'model.bert.encoder.layer.4.mlp.layernorm.weight', 'model.bert.encoder.layer.4.mlp.wo.bias', 'model.bert.encoder.layer.4.mlp.wo.weight', 'model.bert.encoder.layer.5.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.5.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.5.attention.output.dense.bias', 'model.bert.encoder.layer.5.attention.output.dense.weight', 'model.bert.encoder.layer.5.attention.self.Wqkv.bias', 'model.bert.encoder.layer.5.attention.self.Wqkv.weight', 'model.bert.encoder.layer.5.mlp.gated_layers.weight', 'model.bert.encoder.layer.5.mlp.layernorm.bias', 'model.bert.encoder.layer.5.mlp.layernorm.weight', 'model.bert.encoder.layer.5.mlp.wo.bias', 'model.bert.encoder.layer.5.mlp.wo.weight', 'model.bert.encoder.layer.6.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.6.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.6.attention.output.dense.bias', 'model.bert.encoder.layer.6.attention.output.dense.weight', 'model.bert.encoder.layer.6.attention.self.Wqkv.bias', 'model.bert.encoder.layer.6.attention.self.Wqkv.weight', 'model.bert.encoder.layer.6.mlp.gated_layers.weight', 'model.bert.encoder.layer.6.mlp.layernorm.bias', 'model.bert.encoder.layer.6.mlp.layernorm.weight', 'model.bert.encoder.layer.6.mlp.wo.bias', 'model.bert.encoder.layer.6.mlp.wo.weight', 'model.bert.encoder.layer.7.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.7.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.7.attention.output.dense.bias', 'model.bert.encoder.layer.7.attention.output.dense.weight', 'model.bert.encoder.layer.7.attention.self.Wqkv.bias', 'model.bert.encoder.layer.7.attention.self.Wqkv.weight', 
'model.bert.encoder.layer.7.mlp.gated_layers.weight', 'model.bert.encoder.layer.7.mlp.layernorm.bias', 'model.bert.encoder.layer.7.mlp.layernorm.weight', 'model.bert.encoder.layer.7.mlp.wo.bias', 'model.bert.encoder.layer.7.mlp.wo.weight', 'model.bert.encoder.layer.8.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.8.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.8.attention.output.dense.bias', 'model.bert.encoder.layer.8.attention.output.dense.weight', 'model.bert.encoder.layer.8.attention.self.Wqkv.bias', 'model.bert.encoder.layer.8.attention.self.Wqkv.weight', 'model.bert.encoder.layer.8.mlp.gated_layers.weight', 'model.bert.encoder.layer.8.mlp.layernorm.bias', 'model.bert.encoder.layer.8.mlp.layernorm.weight', 'model.bert.encoder.layer.8.mlp.wo.bias', 'model.bert.encoder.layer.8.mlp.wo.weight', 'model.bert.encoder.layer.9.attention.output.LayerNorm.bias', 'model.bert.encoder.layer.9.attention.output.LayerNorm.weight', 'model.bert.encoder.layer.9.attention.output.dense.bias', 'model.bert.encoder.layer.9.attention.output.dense.weight', 'model.bert.encoder.layer.9.attention.self.Wqkv.bias', 'model.bert.encoder.layer.9.attention.self.Wqkv.weight', 'model.bert.encoder.layer.9.mlp.gated_layers.weight', 'model.bert.encoder.layer.9.mlp.layernorm.bias', 'model.bert.encoder.layer.9.mlp.layernorm.weight', 'model.bert.encoder.layer.9.mlp.wo.bias', 'model.bert.encoder.layer.9.mlp.wo.weight', 'model.cls.predictions.decoder.bias', 'model.cls.predictions.decoder.weight', 'model.cls.predictions.transform.LayerNorm.bias', 'model.cls.predictions.transform.LayerNorm.weight', 'model.cls.predictions.transform.dense.bias', 'model.cls.predictions.transform.dense.weight']
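For context: the unexpected `model.cls.predictions.*` keys belong to the masked-language-modeling head from pretraining, while the missing pooler/classifier keys belong to the new classification head. One common workaround (an illustrative sketch, not a confirmed fix for this repo) is to drop the MLM-head keys and load the rest non-strictly, letting the new head stay randomly initialized:

```python
# Illustrative only: filter out the pretraining (MLM) head keys so the
# remaining encoder weights can be loaded into the classification model.
# The "model.cls." prefix matches the unexpected keys in the error above.
def strip_mlm_head(state_dict):
    """Return a copy of state_dict without the MLM-head entries."""
    return {k: v for k, v in state_dict.items()
            if not k.startswith("model.cls.")}

# Example with placeholder values standing in for tensors:
sd = {
    "model.bert.embeddings.word_embeddings.weight": "encoder tensor",
    "model.cls.predictions.decoder.weight": "mlm-head tensor",
}
filtered = strip_mlm_head(sd)
print(sorted(filtered))  # only the encoder key remains

# Then load non-strictly so the missing pooler/classifier weights are
# simply left at their fresh initialization:
#   model.load_state_dict(filtered, strict=False)
```

With `strict=False`, PyTorch tolerates the missing pooler/classifier keys (they keep their random initialization and are trained during finetuning) instead of raising the error above.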
