Training/finetuning code?

by milsunone - opened Oct 12, 2023

Discussion

milsunone

Oct 12, 2023

Can you share the DPO finetuning code you used, or an ETA on when the code will be available?

lewtun

Hugging Face H4 org Oct 13, 2023

Hello @milsunone, we'll be releasing the DPO training code soon in the Alignment Handbook we're working on: https://github.com/huggingface/alignment-handbook

In the meantime, you can adapt the script from TRL which is quite similar to what we'll release: https://github.com/huggingface/trl/blob/main/examples/scripts/dpo.py

jiantongxu

Oct 15, 2023

Great, could you also share which datasets were used during fine-tuning? It would be a great reference for learning how to fine-tune :)

EnmingYuan

Oct 20, 2023

How did you set the beta in DPO?
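
For context on the beta question: in the DPO objective, beta scales the log-probability margin between the chosen and rejected responses, measured relative to a frozen reference model. The snippet below is only a minimal sketch of that per-example loss in plain Python. It is not the code used for this model, the function name `dpo_loss` is hypothetical, and the log-probabilities and beta values are made-up numbers chosen purely for illustration.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta):
    """Per-example DPO loss from summed sequence log-probabilities.

    Hypothetical helper for illustration only.
    """
    # How much more the policy prefers the chosen response over the
    # rejected one, compared with the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Illustrative (made-up) log-probabilities: the margin here is 2.0.
example = dict(policy_chosen_logp=-12.0, policy_rejected_logp=-15.0,
               ref_chosen_logp=-13.0, ref_rejected_logp=-14.0)

# Illustrative beta values only; not the setting used for this model.
for beta in (0.01, 0.1, 0.5):
    print(f"beta={beta}: loss={dpo_loss(beta=beta, **example):.4f}")
```

With a positive margin, a larger beta pushes the loss down faster, which in practice means the policy is allowed to drift less from the reference model before the preference signal saturates.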
