HuggingFaceTB/SmolVLM2-500M-Video-Instruct Video-Text-to-Text • Updated 4 days ago • 3.51k • 37
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated 2 days ago • 19.6k • 638
Running • 279 • Kokoro Text-to-Speech (WebGPU) • High-quality speech synthesis powered by Kokoro TTS
mlx-community/SmolVLM2-500M-Video-Instruct-mlx Video-Text-to-Text • Updated 10 days ago • 460 • 10
FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations Paper • 2411.10818 • Published Nov 16, 2024 • 25
StyleDrop: Text-to-Image Generation in Any Style Paper • 2306.00983 • Published Jun 1, 2023 • 7
Kosmos-2: Grounding Multimodal Large Language Models to the World Paper • 2306.14824 • Published Jun 26, 2023 • 34