Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 10 days ago • 296
Iterative Object Count Optimization for Text-to-image Diffusion Models Paper • 2408.11721 • Published Aug 21, 2024 • 6 • 2
timm/tiny_vit_21m_512.dist_in22k_ft_in1k Image Classification • Updated about 1 month ago • 2.35k • 2
Discriminative Class Tokens for Text-to-Image Diffusion Models Paper • 2303.17155 • Published Mar 30, 2023 • 1
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation Paper • 2309.16429 • Published Sep 28, 2023 • 11
AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation Paper • 2305.13050 • Published May 22, 2023 • 3