A Diffusion Model for Video Inpainting
Vision Transformer Attention Visualization
FitDiT is a high-fidelity virtual try-on model.
PaliGemma 2 Detection with Supervision