DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 12 days ago • 284
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 25 days ago • 86
Dynamic Scaling of Unit Tests for Code Reward Modeling Paper • 2501.01054 • Published Jan 2 • 17
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published Jan 2 • 48
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published Dec 16, 2024 • 54
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 98
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response Paper • 2412.14922 • Published Dec 19, 2024 • 85
Whisper Large V3 Turbo WebGPU: ML-powered speech recognition directly in your browser Space • Running • 144