The Ultra-Scale Playbook 🌌: The ultimate guide to training LLMs on large GPU clusters
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (Paper • 2501.12948 • Published Jan 22)