AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts • Paper • 2402.07625 • Published Feb 12, 2024 • 14
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • 2501.12948 • Published 4 days ago • 186
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos • Paper • 2501.09781 • Published 10 days ago • 21
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise • Paper • 2501.08331 • Published 12 days ago • 17
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training • Paper • 2501.11425 • Published 6 days ago • 74
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models • Paper • 2501.11873 • Published 5 days ago • 59
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics • Paper • 2501.04686 • Published 18 days ago • 50
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference • Paper • 2412.13663 • Published Dec 18, 2024 • 125
Towards Best Practices for Open Datasets for LLM Training • Paper • 2501.08365 • Published 12 days ago • 47
MiniMax-01: Scaling Foundation Models with Lightning Attention • Paper • 2501.08313 • Published 12 days ago • 268
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning • Paper • 2501.06458 • Published 15 days ago • 29
The Lessons of Developing Process Reward Models in Mathematical Reasoning • Paper • 2501.07301 • Published 13 days ago • 86
AdaLomo: Low-memory Optimization with Adaptive Learning Rate • Paper • 2310.10195 • Published Oct 16, 2023 • 2
Full Parameter Fine-tuning for Large Language Models with Limited Resources • Paper • 2306.09782 • Published Jun 16, 2023 • 30