JW17 committed on
Commit e74a696
Parent(s): 382418f

Add official github repo

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -162,6 +162,8 @@ model-index:
 
 **Mistral-ORPO** is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) using the *odds ratio preference optimization (ORPO)*. With ORPO, the model directly learns the preference without the supervised fine-tuning warmup phase. **Mistral-ORPO-β** is fine-tuned exclusively on the 61k instances of the cleaned version of UltraFeedback, [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned), by [Argilla](https://huggingface.co/argilla).
 
+ - **Github Repository**: https://github.com/xfactlab/orpo
+
 ## 👍 **Model Performance**
 
 ### 1) AlpacaEval & MT-Bench
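The model card paragraph in the diff above says that ORPO learns the preference without a supervised fine-tuning warmup; this works because the ORPO objective attaches an odds-ratio preference term directly to the supervised NLL loss, so both are optimized in a single stage. A brief sketch of that objective, following the formulation in the ORPO paper ($y_w$ and $y_l$ denote the chosen and rejected responses, $\lambda$ the preference-term weight); this summary is provided for context and is not part of the diff:

$$
\mathcal{L}_{\mathrm{ORPO}} = \mathbb{E}_{(x,\, y_w,\, y_l)}\!\left[\mathcal{L}_{\mathrm{SFT}} + \lambda \cdot \mathcal{L}_{\mathrm{OR}}\right],
\qquad
\mathcal{L}_{\mathrm{OR}} = -\log \sigma\!\left(\log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)}\right),
\qquad
\mathrm{odds}_\theta(y \mid x) = \frac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)}
$$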