Organization Card

Introduction

We aim to advance LLM reasoning by equipping LLMs with autoregressive search capabilities, in which a single LLM carries out an extended reasoning process with self-reflection and self-exploration of new strategies. We achieve this through our proposed Chain-of-Action-Thought (COAT) reasoning and a new two-stage post-training paradigm: 1) a small-scale format tuning (FT) stage to internalize the COAT reasoning format, and 2) a large-scale self-improvement stage leveraging reinforcement learning (RL). Our approach results in Satori, a 7B LLM trained from an open-source model (Qwen-2.5-Math-7B) on open-source data (OpenMathInstruct-2 and NuminaMath). Key features of Satori include:

Performs self-reflection and self-exploration without external guidance.
Achieves state-of-the-art reasoning performance mainly through self-improvement (RL).
Transfers its reasoning capabilities to unseen domains beyond math.

Resources

Please refer to our blog and research paper for more technical details on Satori.

Blog
Paper: https://arxiv.org/abs/2502.02508

Citation

If you find our model and data helpful, please cite our paper:

@misc{shen2025satorireinforcementlearningchainofactionthought,
      title={Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search}, 
      author={Maohao Shen and Guangtao Zeng and Zhenting Qi and Zhang-Wei Hong and Zhenfang Chen and Wei Lu and Gregory Wornell and Subhro Das and David Cox and Chuang Gan},
      year={2025},
      eprint={2502.02508},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.02508}, 
}

Models

Satori-reasoning/Satori-7B-Round2

Datasets

Satori-reasoning/Satori_FT_data

Satori-reasoning/Satori_RL_data
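
As a rough starting point, the model and datasets listed here can be loaded with the Hugging Face `transformers` and `datasets` libraries. The sketch below is not an official usage recipe: the helper names (`load_model`, `stream_dataset`, `generate`) are hypothetical, the `train` split name is an assumption, and passing the raw question as the prompt is also an assumption, since the model card may specify a prompt template that should be used instead.

```python
# Repository IDs exactly as listed on this page.
MODEL_ID = "Satori-reasoning/Satori-7B-Round2"
DATASET_IDS = {
    "format_tuning": "Satori-reasoning/Satori_FT_data",
    "reinforcement_learning": "Satori-reasoning/Satori_RL_data",
}


def load_model(model_id: str = MODEL_ID):
    """Download the tokenizer and weights (several GB for a 7B model)."""
    # Imported lazily so the constants above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model


def stream_dataset(stage: str, split: str = "train"):
    """Open one of the datasets in streaming mode to avoid a full download."""
    repo_id = DATASET_IDS[stage]  # raises KeyError for an unknown stage
    from datasets import load_dataset

    # Assumption: the split is called "train"; check the dataset viewer.
    return load_dataset(repo_id, split=split, streaming=True)


def generate(tokenizer, model, question: str, max_new_tokens: int = 512) -> str:
    """Generate a completion for a question.

    Assumption: the raw question is used as the prompt; the model card may
    define a specific template.
    """
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A typical session would call `tokenizer, model = load_model()` once, then `generate(tokenizer, model, "...")` per question; `stream_dataset("format_tuning")` returns an iterable that can be inspected with `next(iter(...))` before committing to a full download.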

