Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? Paper • 2502.19361 • Published 8 days ago • 24
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning Paper • 2502.17407 • Published 10 days ago • 24
Language Models can Self-Improve at State-Value Estimation for Better Search Paper • 2503.02878 • Published 2 days ago • 7
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published 3 days ago • 24
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers Paper • 2502.20545 • Published 7 days ago • 18
LADDER: Self-Improving LLMs Through Recursive Problem Decomposition Paper • 2503.00735 • Published 5 days ago • 14