Wukong: Towards a Scaling Law for Large-Scale Recommendation Paper • 2403.02545 • Published Mar 4, 2024 • 17
Secrets of RLHF in Large Language Models Part I: PPO Paper • 2307.04964 • Published Jul 11, 2023 • 28