---
license: apache-2.0
---

dpo-phi2 is an instruction-tuned model based on microsoft/phi-2. It was fine-tuned with direct preference optimization (DPO) on the argilla/distilabel-intel-orca-dpo-pairs dataset.
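
For readers unfamiliar with DPO, the per-pair objective can be sketched roughly as follows. This is a minimal illustration of the standard DPO loss on scalar log-probabilities, not the training code used for this model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for the example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    `beta` is a hypothetical default, not a value reported for dpo-phi2.
    """
    # How much more (or less) likely each response is under the policy
    # than under the reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the
    # chosen response more strongly than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # computed in a numerically stable way for either sign of margin.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and reference agree, the margin is zero and the loss equals log 2. The loss falls as the policy shifts probability toward the chosen response relative to the reference. In a dataset such as argilla/distilabel-intel-orca-dpo-pairs, each example would supply one chosen and one rejected response for this comparison.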