A question about the training processes for the NaFlex versions vs. the fixed-resolution versions

by qijimrc - opened 2 days ago

Thank you for your great work! After reading the paper, I am still unsure about one point. Were only the NaFlex versions (e.g., siglip2-so400m-patch16-naflex) trained with the distillation loss and masked prediction loss described in Section 2.3? And were the other, fixed-resolution versions (like siglip2-so400m-patch16-384, ...) only further trained from the checkpoint produced by the training described in Section 2.2?
Thank you very much!
