license: mit

VIRL-L-Init

This model serves as an initial checkpoint for reproducing the results in the paper "SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training".
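As a starting point for reproduction, the checkpoint files can be fetched locally with `huggingface_hub`. The snippet below is a minimal sketch, not the authors' documented procedure: the repo id `tianzhechu/VIRL-L-Init` is an assumption inferred from the uploader name and model title, and `download_checkpoint` and the `checkpoints` folder are hypothetical names chosen for illustration. See the GitHub repository linked below for the actual training and evaluation scripts.

```python
from pathlib import Path

# Assumption: repo id inferred from the uploader name and model title;
# this README does not state it explicitly.
ASSUMED_REPO_ID = "tianzhechu/VIRL-L-Init"


def download_checkpoint(repo_id: str = ASSUMED_REPO_ID,
                        local_dir: str = "checkpoints") -> Path:
    """Download all files of the model repo and return the local folder."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # Store the files under <local_dir>/<model name>.
    target = Path(local_dir) / repo_id.split("/")[-1]
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download_checkpoint()` requires network access and `pip install huggingface_hub`; the returned path can then be passed to whichever script expects the initial checkpoint.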

Related links

Website: https://tianzhechu.com/SFTvsRL/

GitHub: https://github.com/LeslieTrue/SFTvsRL

arXiv: https://arxiv.org/abs/2501.17161v1

HF: https://huggingface.co/papers/2501.17161