You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
两代码的区别造成 RuntimeError: Error(s) in loading state_dict for MPLUG: size mismatch for visual_encoder.visual.positional_embedding: copying a param with shape torch.Size([x, 768]) from checkpoint, the shape in current model is torch.Size([x, 1024]).的错误。
如有错误敬请指正
The text was updated successfully, but these errors were encountered:
AliceMind/mPLUG
/caption_mplug.py中234-239行中,无论是visual encoder用了ViT-B-16还是ViT-L-14,在根据输入图像分辨率更改positional embedding时,创建的pos_embedd的dim=1都是768。
而在初始化MPLUG模型中初始化CLIP的时,即调用AliceMind/models
/visual_transformers.py中初始化CLIP代码时,positional embedding的dim=1的值时根据visual encoder用了ViT-B-16还是ViT-L-14而决定的,B-16时dim=1为768,L-14时dim=1为1024。
两代码的区别造成
RuntimeError: Error(s) in loading state_dict for MPLUG: size mismatch for visual_encoder.visual.positional_embedding: copying a param with shape torch.Size([x, 768]) from checkpoint, the shape in current model is torch.Size([x, 1024]).
的错误。如有错误敬请指正
The text was updated successfully, but these errors were encountered: