DeepSeek-R1 Collection by deepseek-ai 15 days ago 366 deepseek-ai/DeepSeek-R1 Text Generation • Updated 3 days ago • 1.04M • 6.46k deepseek-ai/DeepSeek-R1-Zero Text Generation • Updated 3 days ago • 22k • 690 deepseek-ai/DeepSeek-R1-Distill-Llama-70B Text Generation • Updated 3 days ago • 157k • 448 deepseek-ai/DeepSeek-R1-Distill-Qwen-32B Text Generation • Updated 3 days ago • 358k • 847
Qwen2.5-VL Vision-language model series based on Qwen2.5 Collection by Qwen 9 days ago 313 Qwen/Qwen2.5-VL-3B-Instruct Image-Text-to-Text • Updated about 5 hours ago • 68k • 148 Qwen/Qwen2.5-VL-7B-Instruct Image-Text-to-Text • Updated 8 days ago • 175k • 301 Qwen/Qwen2.5-VL-72B-Instruct Image-Text-to-Text • Updated 8 days ago • 22.8k • 207
DeepSeek R1 (All Versions) DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. Collection by unsloth about 7 hours ago 139 unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF Updated 6 days ago • 286k • 191 unsloth/DeepSeek-R1-GGUF Text Generation • Updated 5 days ago • 301k • 515 unsloth/DeepSeek-R1-Distill-Qwen-32B-GGUF Updated 10 days ago • 168k • 85 unsloth/DeepSeek-R1-Distill-Qwen-7B-GGUF Updated 10 days ago • 111k • 63
Tulu 3 Models All models released with Tulu 3 -- state of the art open post-training recipes. Collection by allenai 6 days ago 85 allenai/Llama-3.1-Tulu-3-8B Text Generation • Updated 6 days ago • 12.7k • 137 allenai/Llama-3.1-Tulu-3-70B Text Generation • Updated 6 days ago • 7.72k • 50 allenai/Llama-3.1-Tulu-3-405B Text Generation • Updated 6 days ago • 469 • 83 allenai/Llama-3.1-Tulu-3-8B-DPO Text Generation • Updated 6 days ago • 28.5k • 22
Dobby-Mini Preview models for the upcoming family of Dobby LLMs. Collection by SentientAGI 5 days ago 47 SentientAGI/Dobby-Mini-Unhinged-Llama-3.1-8B Text Generation • Updated about 2 hours ago • 1.09k • 17 SentientAGI/Dobby-Mini-Leashed-Llama-3.1-8B Text Generation • Updated about 2 hours ago • 95 • 7 SentientAGI/Dobby-Mini-Unhinged-Llama-3.1-8B_GGUF Updated 11 days ago • 298 • 3 SentientAGI/Dobby-Mini-Leashed-Llama-3.1-8B_GGUF Updated 11 days ago • 84 • 1
Reasoning Datasets Distilled synthetic Reasoning datasets Collection by philschmid 2 days ago 42 ServiceNow-AI/R1-Distill-SFT Viewer • Updated 7 days ago • 1.85M • 1.48k • 147 open-thoughts/OpenThoughts-114k Viewer • Updated 6 days ago • 114k • 19.9k • 250 bespokelabs/Bespoke-Stratos-17k Viewer • Updated 5 days ago • 16.7k • 29.9k • 195 EricLu/SCP-116K Viewer • Updated 7 days ago • 117k • 177 • 35
🧠 Reasoning datasets Datasets with reasoning traces for math and code released by the community Collection by open-r1 4 days ago 31 bespokelabs/Bespoke-Stratos-17k Viewer • Updated 5 days ago • 16.7k • 29.9k • 195 open-thoughts/OpenThoughts-114k Viewer • Updated 6 days ago • 114k • 19.9k • 250 open-r1/OpenThoughts-114k-math Viewer • Updated 5 days ago • 89.1k • 526 • 34 PrimeIntellect/NuminaMath-QwQ-CoT-5M Viewer • Updated 13 days ago • 5.14M • 1.86k • 38
DeepSeek-V3 Collection by deepseek-ai 30 days ago 172 deepseek-ai/DeepSeek-V3-Base Updated 11 days ago • 27.6k • 1.5k deepseek-ai/DeepSeek-V3 Text Generation • Updated 11 days ago • 930k • 3.1k DeepSeek-V3 Technical Report Paper • 2412.19437 • Published Dec 27, 2024 • 47
Qwen2.5-1M The long-context version of Qwen2.5, supporting 1M-token context lengths Collection by Qwen 9 days ago 97 Qwen/Qwen2.5-14B-Instruct-1M Text Generation • Updated 6 days ago • 10.7k • 215 Qwen/Qwen2.5-7B-Instruct-1M Text Generation • Updated 6 days ago • 25k • 174
Qwen2.5 Qwen2.5 language models: pretrained and instruction-tuned models in 7 sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B). Collection by Qwen Nov 28, 2024 498 Running 627 Qwen2.5 🚀 Chat with Qwen, a helpful AI assistant Qwen/Qwen2.5-0.5B Text Generation • Updated Sep 25, 2024 • 372k • 177 Qwen/Qwen2.5-0.5B-Instruct Text Generation • Updated Sep 25, 2024 • 696k • 198 Qwen/Qwen2.5-1.5B Text Generation • Updated Oct 8, 2024 • 371k • 61