A question about the training processes for the NaFlex versions vs. the fixed-resolution versions
#4, opened by qijimrc
Thank you for your great work! After reading the paper, I am unsure about one point. Were only the NaFlex versions (e.g., siglip2-so400m-patch16-naflex) trained with the distillation loss and masked prediction loss described in Section 2.3? And were the fixed-resolution versions (e.g., siglip2-so400m-patch16-384) only trained further from the checkpoint produced by the training described in Section 2.2?
Thank you very much!