What kind of training data was used in the RL process of R1 Zero?

#14
by RitchieLeung - opened

Thanks to DeepSeek for the awesome work. I have a question after reading the technical report:
What kind of training data was used in the RL process of R1 Zero?
