Mistral 7B Zephyr Orpo
The Zephyr Orpo recipe applied on top of Mistral 7B v0.2 (new recipe with new Mistral base model)
Model description
- Model type: A 7.2B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
- Language(s) (NLP): Primarily English
- Finetuned from model: wandb/Mistral-7B-v0.2
Recipe
We trained using the alignment handbook recipe and logging to W&B
Visit the W&B workspace here
Results:
- MT bench
########## First turn ##########
score
model turn
zephyr-orpo-7b-v0.2 1 7.44375
########## Second turn ##########
score
model turn
zephyr-orpo-7b-v0.2 2 6.875
########## Average ##########
score
model
zephyr-orpo-7b-v0.2 7.159375
Trained on a single H100 for 2 hours!
- Downloads last month
- 7
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.
Model tree for wandb/zephyr-orpo-7b-v0.2
Base model
wandb/Mistral-7B-v0.2