If I may ask, could you open-source the fine-tuning and PPO implementation code?

by xiao111 - opened Aug 21, 2023

Aug 21, 2023

fb700

Owner Aug 21, 2023

The plan is to open-source it once the main model reaches 300 ❤.

Aug 21, 2023

Thanks to the author. One more question: for the PPO implementation, did you train a separate reward model yourself using ChatGLM?

fb700

Owner Aug 21, 2023

Not entirely. A self-trained RM (reward model) is layered on top.

Aug 21, 2023

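> Not entirely. A self-trained RM (reward model) is layered on top.

So for the final PPO stage, was it implemented mainly with the trl library, or with some other library?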

fb700

Owner Aug 21, 2023

peft
