I'm trying to load a pretrained LLaVA checkpoint (hub/llava-phi-3-mini-pth/model.pth) and I get this strange error. I'm using DeepSpeed ZeRO-3 and flash-attn.
RuntimeError: Error(s) in loading state_dict for LLaVAModel:
size mismatch for llm.model.embed_tokens.weight: copying a param with shape torch.Size([32064, 3072]) from checkpoint, the shape in current model is torch.Size([0]).
size mismatch for llm.model.layers.0.self_attn.o_proj.weight: copying a param with shape torch.Size([3072, 3072]) from checkpoint, the shape in current model is torch.Size([0]).
Any clue? Thanks!
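A likely cause (an assumption, not confirmed above): under DeepSpeed ZeRO-3, each rank's parameters are partitioned and replaced by empty placeholder tensors of shape `torch.Size([0])` until they are gathered, so calling `load_state_dict` directly on the wrapped model fails with exactly this size mismatch. A minimal plain-PyTorch sketch reproducing the symptom:

```python
import torch
import torch.nn as nn

# One layer with the same shape as llm.model.layers.0.self_attn.o_proj.
model = nn.Linear(3072, 3072, bias=False)
ckpt = {"weight": torch.zeros(3072, 3072)}  # full tensor from the checkpoint

# Simulate a ZeRO-3-partitioned parameter: the local data is freed,
# leaving an empty shape-[0] placeholder on this rank.
model.weight.data = torch.empty(0)

try:
    model.load_state_dict(ckpt)
except RuntimeError as e:
    # Same "size mismatch ... current model is torch.Size([0])" error.
    print(type(e).__name__, "size mismatch" in str(e))
```

If this is the cause, the usual remedies are to load the state dict inside a `deepspeed.zero.GatheredParameters(model.parameters(), modifier_rank=0)` context so the full parameters are materialized first, or to let DeepSpeed's own checkpoint-loading path (e.g. `zero_to_fp32.py` for converting ZeRO checkpoints) handle it instead of a raw `load_state_dict`.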