Article Mini-R1: Reproduce Deepseek R1 "aha moment" a RL tutorial By open-r1 • 4 days ago • 22
Article **How biased is Whisper? Evaluating Whisper Models for Robustness to Diverse English Accents** By Steveeeeeeen • 5 days ago • 12
Exploring the sustainable scaling of AI dilemma: A projective study of corporations' AI environmental impacts Paper • 2501.14334 • Published 11 days ago • 15
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published 24 days ago • 42
Article Yay! Organizations can now publish blog Articles By huggingface • 14 days ago • 30
Article MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era By MiniMax-AI • 20 days ago • 40
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated Dec 22, 2024 • 211
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 26 days ago • 252
DolphinLabeled Datasets Collection Eric Hartford has added labels to help you filter datasets, for your pleasure. • 5 items • Updated 28 days ago • 11
YuLan-Mini: An Open Data-efficient Language Model Paper • 2412.17743 • Published Dec 23, 2024 • 64
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 50
Balancing Pipeline Parallelism with Vocabulary Parallelism Paper • 2411.05288 • Published Nov 8, 2024 • 19
LoLCATS Collection Linearizing LLMs with high quality and efficiency. We linearize the full Llama 3.1 model family -- 8b, 70b, 405b -- for the first time! • 4 items • Updated Oct 14, 2024 • 15