Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback Paper • 2501.12895 • Published 4 days ago • 45
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published 4 days ago • 66
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 4 days ago • 187
PaSa: An LLM Agent for Comprehensive Academic Paper Search Paper • 2501.10120 • Published 9 days ago • 37
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 12 days ago • 268
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 12 days ago • 47
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 13 days ago • 86
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published 16 days ago • 59
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 18 days ago • 81
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 17 days ago • 80
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 18 days ago • 89
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 18 days ago • 248
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published 19 days ago • 66
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models Paper • 2501.03124 • Published 20 days ago • 14
Test-time Computing: from System-1 Thinking to System-2 Thinking Paper • 2501.02497 • Published 21 days ago • 41
Personalized Graph-Based Retrieval for Large Language Models Paper • 2501.02157 • Published 22 days ago • 28
Are Vision-Language Models Truly Understanding Multi-vision Sensor? Paper • 2412.20750 • Published 27 days ago • 20