RuntimeError: Error(s) in loading state_dict for CLIPVisionModelWithProjection: Unexpected key(s) in state_dict: "vision_model.embeddings.position_ids".

#2
by tatsuMura - opened

Hi, nice work!
I ran this code on Google Colab, but I got the following error when loading the CLIP image encoder.

RuntimeError: Error(s) in loading state_dict for CLIPVisionModelWithProjection:
    Unexpected key(s) in state_dict: "vision_model.embeddings.position_ids". 

Is there something wrong with the image encoder weights?
Please tell me how to solve this problem.

Please check whether your installed package versions match the ones listed in requirements.txt.
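
An "unexpected key" error like this usually means the checkpoint was saved with a different library version than the one loading it, so comparing installed versions against the pinned ones is a quick first check. Here is a rough sketch of that check, not an official tool from this repo. The helper names (`parse_pins`, `find_mismatches`, `main`) are made up for illustration, and it assumes requirements.txt pins packages as `name==version`; other forms such as `>=` are skipped.

```python
from importlib import metadata


def parse_pins(requirement_lines):
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in requirement_lines:
        # Drop trailing comments and environment markers, then whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # blank, unpinned, or otherwise unsupported line
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (pinned, installed)} for every package that differs."""
    mismatches = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


def main(path="requirements.txt"):
    """Print one line per package whose installed version is not the pinned one."""
    with open(path) as fh:
        pins = parse_pins(fh)
    for name, (pinned, found) in find_mismatches(pins).items():
        print(f"{name}: pinned {pinned}, installed {found}")
```

Calling `main()` from the directory that contains requirements.txt prints nothing when every pinned package matches. Any line it does print names a package worth reinstalling at the pinned version.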

Thanks!
I reinstalled the packages at the versions listed in requirements.txt, and it worked.
The problem is resolved!

tatsuMura changed discussion status to closed