license: apache-2.0
dpo-phi2 is an instruction-tuned model based on microsoft/phi-2. It was fine-tuned with direct preference optimization (DPO) on the argilla/distilabel-intel-orca-dpo-pairs dataset.
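
For readers new to DPO, the per-example objective can be sketched as below. This is a minimal illustration of the standard DPO loss, not the training code used for this model: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for the example, and the inputs are taken to be summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss (illustrative; beta=0.1 is an assumed value)."""
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the chosen
    # response more strongly than the reference model does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # written in a numerically stable form for either sign of margin.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: margin is 0, loss is log(2).
baseline = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# Policy has shifted probability toward the chosen response: loss drops.
improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)
```

The loss falls below `log(2)` only when the policy raises the chosen response relative to the rejected one by more than the reference model does, which is what ties the fine-tuned model to the preference pairs in the dataset.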