15.7.1 Error when loading the pretrained bert.base model #1261
Closed
Conversation
Following up on the bug: based on the error message, I further adjusted the model hyperparameters.

```python
# BERT model
bert = BERTModel(vocab_size=60005,
                 num_hiddens=768,
                 norm_shape=[768],
                 ffn_num_input=768,
                 ffn_num_hiddens=3072,
                 num_heads=4,
                 num_layers=2,
                 dropout=0.2,
                 max_len=512,
                 key_size=768,
                 query_size=768,
                 value_size=768,
                 hid_in_features=768,
                 mlm_in_features=768,
                 nsp_in_features=768)

# Load the model
base_path = r'this is bert abs path'
bert.load_state_dict(torch.load(data_dir))
```

The size-mismatch errors at the bottom are gone, but the "keys in state_dict" errors at the top remain:

---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[70], line 2
1 # Load
----> 2 bert.load_state_dict(torch.load(data_dir))
File ~/.virtualenvs/dl-pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py:1671, in Module.load_state_dict(self, state_dict, strict)
1666 error_msgs.insert(
1667 0, 'Missing key(s) in state_dict: {}. '.format(
1668 ', '.join('"{}"'.format(k) for k in missing_keys)))
1670 if len(error_msgs) > 0:
-> 1671 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
1672 self.__class__.__name__, "\n\t".join(error_msgs)))
1673 return _IncompatibleKeys(missing_keys, unexpected_keys)
RuntimeError: Error(s) in loading state_dict for BERTModel:
Unexpected key(s) in state_dict: "encoder.blks.2.attention.W_q.weight", "encoder.blks.2.attention.W_q.bias", "encoder.blks.2.attention.W_k.weight", "encoder.blks.2.attention.W_k.bias", "encoder.blks.2.attention.W_v.weight", "encoder.blks.2.attention.W_v.bias", "encoder.blks.2.attention.W_o.weight", "encoder.blks.2.attention.W_o.bias", "encoder.blks.2.addnorm1.ln.weight", "encoder.blks.2.addnorm1.ln.bias", "encoder.blks.2.ffn.dense1.weight", "encoder.blks.2.ffn.dense1.bias", "encoder.blks.2.ffn.dense2.weight", "encoder.blks.2.ffn.dense2.bias", "encoder.blks.2.addnorm2.ln.weight", "encoder.blks.2.addnorm2.ln.bias", "encoder.blks.3.attention.W_q.weight", "encoder.blks.3.attention.W_q.bias", "encoder.blks.3.attention.W_k.weight", "encoder.blks.3.attention.W_k.bias", "encoder.blks.3.attention.W_v.weight", "encoder.blks.3.attention.W_v.bias", "encoder.blks.3.attention.W_o.weight", "encoder.blks.3.attention.W_o.bias", "encoder.blks.3.addnorm1.ln.weight", "encoder.blks.3.addnorm1.ln.bias", "encoder.blks.3.ffn.dense1.weight", "encoder.blks.3.ffn.dense1.bias", "encoder.blks.3.ffn.dense2.weight", "encoder.blks.3.ffn.dense2.bias", "encoder.blks.3.addnorm2.ln.weight", "encoder.blks.3.addnorm2.ln.bias", "encoder.blks.4.attention.W_q.weight", "encoder.blks.4.attention.W_q.bias", "encoder.blks.4.attention.W_k.weight", "encoder.blks.4.attention.W_k.bias", "encoder.blks.4.attention.W_v.weight", "encoder.blks.4.attention.W_v.bias", "encoder.blks.4.attention.W_o.weight", "encoder.blks.4.attention.W_o.bias", "encoder.blks.4.addnorm1.ln.weight", "encoder.blks.4.addnorm1.ln.bias", "encoder.blks.4.ffn.dense1.weight", "encoder.blks.4.ffn.dense1.bias", "encoder.blks.4.ffn.dense2.weight", "encoder.blks.4.ffn.dense2.bias", "encoder.blks.4.addnorm2.ln.weight", "encoder.blks.4.addnorm2.ln.bias", "encoder.blks.5.attention.W_q.weight", "encoder.blks.5.attention.W_q.bias", "encoder.blks.5.attention.W_k.weight", "encoder.blks.5.attention.W_k.bias", "encoder.blks.5.attention.W_v.weight", 
"encoder.blks.5.attention.W_v.bias", "encoder.blks.5.attention.W_o.weight", "encoder.blks.5.attention.W_o.bias", "encoder.blks.5.addnorm1.ln.weight", "encoder.blks.5.addnorm1.ln.bias", "encoder.blks.5.ffn.dense1.weight", "encoder.blks.5.ffn.dense1.bias", "encoder.blks.5.ffn.dense2.weight", "encoder.blks.5.ffn.dense2.bias", "encoder.blks.5.addnorm2.ln.weight", "encoder.blks.5.addnorm2.ln.bias", "encoder.blks.6.attention.W_q.weight", "encoder.blks.6.attention.W_q.bias", "encoder.blks.6.attention.W_k.weight", "encoder.blks.6.attention.W_k.bias", "encoder.blks.6.attention.W_v.weight", "encoder.blks.6.attention.W_v.bias", "encoder.blks.6.attention.W_o.weight", "encoder.blks.6.attention.W_o.bias", "encoder.blks.6.addnorm1.ln.weight", "encoder.blks.6.addnorm1.ln.bias", "encoder.blks.6.ffn.dense1.weight", "encoder.blks.6.ffn.dense1.bias", "encoder.blks.6.ffn.dense2.weight", "encoder.blks.6.ffn.dense2.bias", "encoder.blks.6.addnorm2.ln.weight", "encoder.blks.6.addnorm2.ln.bias", "encoder.blks.7.attention.W_q.weight", "encoder.blks.7.attention.W_q.bias", "encoder.blks.7.attention.W_k.weight", "encoder.blks.7.attention.W_k.bias", "encoder.blks.7.attention.W_v.weight", "encoder.blks.7.attention.W_v.bias", "encoder.blks.7.attention.W_o.weight", "encoder.blks.7.attention.W_o.bias", "encoder.blks.7.addnorm1.ln.weight", "encoder.blks.7.addnorm1.ln.bias", "encoder.blks.7.ffn.dense1.weight", "encoder.blks.7.ffn.dense1.bias", "encoder.blks.7.ffn.dense2.weight", "encoder.blks.7.ffn.dense2.bias", "encoder.blks.7.addnorm2.ln.weight", "encoder.blks.7.addnorm2.ln.bias", "encoder.blks.8.attention.W_q.weight", "encoder.blks.8.attention.W_q.bias", "encoder.blks.8.attention.W_k.weight", "encoder.blks.8.attention.W_k.bias", "encoder.blks.8.attention.W_v.weight", "encoder.blks.8.attention.W_v.bias", "encoder.blks.8.attention.W_o.weight", "encoder.blks.8.attention.W_o.bias", "encoder.blks.8.addnorm1.ln.weight", "encoder.blks.8.addnorm1.ln.bias", "encoder.blks.8.ffn.dense1.weight", 
"encoder.blks.8.ffn.dense1.bias", "encoder.blks.8.ffn.dense2.weight", "encoder.blks.8.ffn.dense2.bias", "encoder.blks.8.addnorm2.ln.weight", "encoder.blks.8.addnorm2.ln.bias", "encoder.blks.9.attention.W_q.weight", "encoder.blks.9.attention.W_q.bias", "encoder.blks.9.attention.W_k.weight", "encoder.blks.9.attention.W_k.bias", "encoder.blks.9.attention.W_v.weight", "encoder.blks.9.attention.W_v.bias", "encoder.blks.9.attention.W_o.weight", "encoder.blks.9.attention.W_o.bias", "encoder.blks.9.addnorm1.ln.weight", "encoder.blks.9.addnorm1.ln.bias", "encoder.blks.9.ffn.dense1.weight", "encoder.blks.9.ffn.dense1.bias", "encoder.blks.9.ffn.dense2.weight", "encoder.blks.9.ffn.dense2.bias", "encoder.blks.9.addnorm2.ln.weight", "encoder.blks.9.addnorm2.ln.bias", "encoder.blks.10.attention.W_q.weight", "encoder.blks.10.attention.W_q.bias", "encoder.blks.10.attention.W_k.weight", "encoder.blks.10.attention.W_k.bias", "encoder.blks.10.attention.W_v.weight", "encoder.blks.10.attention.W_v.bias", "encoder.blks.10.attention.W_o.weight", "encoder.blks.10.attention.W_o.bias", "encoder.blks.10.addnorm1.ln.weight", "encoder.blks.10.addnorm1.ln.bias", "encoder.blks.10.ffn.dense1.weight", "encoder.blks.10.ffn.dense1.bias", "encoder.blks.10.ffn.dense2.weight", "encoder.blks.10.ffn.dense2.bias", "encoder.blks.10.addnorm2.ln.weight", "encoder.blks.10.addnorm2.ln.bias", "encoder.blks.11.attention.W_q.weight", "encoder.blks.11.attention.W_q.bias", "encoder.blks.11.attention.W_k.weight", "encoder.blks.11.attention.W_k.bias", "encoder.blks.11.attention.W_v.weight", "encoder.blks.11.attention.W_v.bias", "encoder.blks.11.attention.W_o.weight", "encoder.blks.11.attention.W_o.bias", "encoder.blks.11.addnorm1.ln.weight", "encoder.blks.11.addnorm1.ln.bias", "encoder.blks.11.ffn.dense1.weight", "encoder.blks.11.ffn.dense1.bias", "encoder.blks.11.ffn.dense2.weight", "encoder.blks.11.ffn.dense2.bias", "encoder.blks.11.addnorm2.ln.weight", "encoder.blks.11.addnorm2.ln.bias". |
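Note that the unexpected keys all belong to `encoder.blks.2` through `encoder.blks.11`, which suggests the checkpoint was saved from an encoder with 12 blocks, while the model above was built with `num_layers=2`. A minimal sketch of the same mismatch and one recovery, using a hypothetical stand-in module (`make_encoder` is illustrative only; the real `BERTModel` comes from d2l):

```python
import torch
from torch import nn

# Hypothetical stand-in: an "encoder" with a configurable number of blocks,
# mirroring how BERTModel stacks num_layers blocks under encoder.blks.
def make_encoder(num_layers, num_hiddens=8):
    return nn.Sequential(*[nn.Linear(num_hiddens, num_hiddens)
                           for _ in range(num_layers)])

# A checkpoint saved from a 12-block model...
ckpt = make_encoder(12).state_dict()

# ...refuses to load into a 2-block model: layers 0 and 1 match in shape
# (no size-mismatch errors), but keys for blocks 2..11 are unexpected.
shallow = make_encoder(2)
try:
    shallow.load_state_dict(ckpt)
except RuntimeError as e:
    assert "Unexpected key(s)" in str(e)

# One way to recover the intended depth from the checkpoint itself:
# count the distinct top-level block indices in the key names.
depth = len({k.split(".")[0] for k in ckpt})
model = make_encoder(depth)   # depth == 12 matches the checkpoint
model.load_state_dict(ckpt)   # now loads cleanly
```

Under this reading, rebuilding the model with `num_layers=12` (the depth of the published bert.base checkpoint) should clear the remaining unexpected-key errors.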
Closing this PR since we don't take external releases. Thanks!
Job PR-1261-4b9a3a2 is done.
OK, let me close this issue.
Loading code
Error screenshot
Error excerpt
RuntimeError: Error(s) in loading state_dict for BERTModel:
Unexpected key(s) in state_dict: "encoder.blks.2.attention.W_q.weight", "encoder.blks.2.attention.W_q.bias", "encoder.blks.2.attention.W_k.weight", "encoder.blks.2.attention.W_k.bias", "encoder.blks.2.attention.W_v.weight", "encoder.blks.2.attention.W_v.bias", "encoder.blks.2.attention.W_o.weight", "encoder.blks.2.attention.W_o.bias", "encoder.blks.2.addnorm1.ln.weight", "encoder.blks.2.addnorm1.ln.bias", "encoder.blks.2.ffn.dense1.weight", "encoder.blks.2.ffn.dense1.bias", "encoder.blks.2.ffn.dense2.weight", "encoder.blks.2.ffn.dense2.bias", "encoder.blks.2.addnorm2.ln.weight", "encoder.blks.2.addnorm2.ln.bias", "encoder.blks.3.attention.W_q.weight", "encoder.blks.3.attention.W_q.bias", "encoder.blks.3.attention.W_k.weight", "encoder.blks.3.attention.W_k.bias", "encoder.blks.3.attention.W_v.weight", "encoder.blks.3.attention.W_v.bias", "encoder.blks.3.attention.W_o.weight", "encoder.blks.3.attention.W_o.bias", "encoder.blks.3.addnorm1.ln.weight", "encoder.blks.3.addnorm1.ln.bias", "encoder.blks.3.ffn.dense1.weight", "encoder.blks.3.ffn.dense1.bias", "encoder.blks.3.ffn.dense2.weight", "encoder.blks.3.ffn.dense2.bias", "encoder.blks.3.addnorm2.ln.weight", "encoder.blks.3.addnorm2.ln.bias", "encoder.blks.4.attention.W_q.weight", "encoder.blks.4.attention.W_q.bias", "encoder.blks.4.attention.W_k.weight", "encoder.blks.4.attention.W_k.bias", "encoder.blks.4.attention.W_v.weight", "encoder.blks.4.attention.W_v.bias", "encoder.blks.4.attention.W_o.weight", "encoder.blks.4.attention.W_o.bias", "encoder.blks.4.addnorm1.ln.weight", "encoder.blks.4.addnorm1.ln.bias", "encoder.blks.4.ffn.dense1.weight",
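As a diagnostic aside: rather than letting the strict load raise, `load_state_dict(strict=False)` returns the mismatched key names, which makes it easy to see why a checkpoint refuses to load before rebuilding the model. A small sketch with hypothetical stand-in modules (not the real BERTModel):

```python
import torch
from torch import nn

# Two deliberately mismatched modules: the checkpoint source exposes keys
# "weight"/"bias", while the target expects "0.weight"/"0.bias".
saved = nn.Linear(4, 4)
target = nn.Sequential(nn.Linear(4, 4))

# strict=False does not raise; it returns an _IncompatibleKeys result
# listing exactly which keys failed to line up.
result = target.load_state_dict(saved.state_dict(), strict=False)
print(result.unexpected_keys)  # checkpoint keys with no matching parameter
print(result.missing_keys)     # parameters the checkpoint did not provide
```

Comparing the two lists against the model's hyperparameters (here, `num_layers`) usually points directly at the misconfigured argument.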