BIG release by DeepSeek AI 🔥🔥🔥

DeepSeek-R1 & DeepSeek-R1-Zero: two 660B reasoning models are here, alongside 6 distilled dense models (based on Llama & Qwen) for the community!
https://huggingface.co/deepseek-ai
deepseek-ai/DeepSeek-R1

✨ MIT License: enabling distillation for custom models
✨ 32B & 70B models match OpenAI o1-mini in multiple capabilities
✨ API live now! Access Chain of Thought reasoning with model='deepseek-reasoner'
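A minimal sketch of calling the reasoning model, assuming DeepSeek's OpenAI-compatible endpoint at `https://api.deepseek.com` and an API key in a `DEEPSEEK_API_KEY` environment variable (both assumptions; check DeepSeek's own docs). Only the request payload is built and printed here; the actual network call is shown commented out:

```python
import os

# Chat-completion request payload for the reasoning model named in the post.
payload = {
    "model": "deepseek-reasoner",  # model id from the announcement
    "messages": [
        {"role": "user", "content": "How many prime numbers are below 20?"}
    ],
}

# With the OpenAI Python SDK, the call would look roughly like this
# (left commented out so the sketch runs without credentials):
#
# from openai import OpenAI
# client = OpenAI(
#     api_key=os.environ["DEEPSEEK_API_KEY"],       # assumed env var
#     base_url="https://api.deepseek.com",          # assumed endpoint
# )
# resp = client.chat.completions.create(**payload)
# print(resp.choices[0].message.content)            # final answer

print(payload["model"])
```

The prompt text and the field names outside `model` are illustrative; the announcement only confirms `model='deepseek-reasoner'`.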
Article: 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs. By wolfram • Dec 4, 2024