Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google • 7 days ago • 58
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 • 8 days ago • 89
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature Paper • 2501.07171 • Published Jan 13 • 50
Scaling Pre-training to One Hundred Billion Data for Vision Language Models Paper • 2502.07617 • Published 14 days ago • 28
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 207
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated 19 days ago • 50
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 5 days ago • 240
Stable Flow: Vital Layers for Training-Free Image Editing Paper • 2411.14430 • Published Nov 21, 2024 • 22
Diffusers Guides Collection Collection of diffusers guides and their respective spaces • 2 items • Updated Oct 9, 2024 • 2
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 18 days ago • 199