---
license: apache-2.0
language:
- en
---

# WestSeverus-7B-DPO-v2

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a53b0747a04f0512941b6f/-_CvSGuu-kQ1GDNzVMYjg.png)

## ⚙️ Model Description

WestSeverus-7B-DPO-v2 is a WestLake-family model trained on top of [WestSeverus-7B](https://huggingface.co/FelixChao/WestSeverus-7B).

The model was trained on several DPO datasets and performs well on basic math problems.

WestSeverus-7B-DPO-v2 can be used in mathematics, chemistry, physics, and even coding, for further research and reference.

# 📖 Table of Contents

1. [Nous Benchmark Results](#💪-nous-benchmark-results)
   - AGIEval
   - GPT4All
   - TruthfulQA Scores
   - BigBench
2. [Open LLM Leaderboard](#🏆-open-llm-leaderboard)
   - ARC
   - HellaSwag
   - MMLU
   - TruthfulQA
   - Winogrande
   - GSM8K
3. [EvalPlus Leaderboard](#⚡-evalplus-leaderboard)
   - HumanEval
   - HumanEval_Plus
   - MBPP
   - MBPP_Plus
4. [Prompt Format](#⚙️-prompt-format)
5. [Quantized Models](#🛠️-quantized-models)
6. [Gratitude](#🙏-gratitude)

## 💪 Nous Benchmark Results

WestSeverus-7B-DPO-v2 currently tops the [YALL - Yet Another LLM Leaderboard](https://huggingface.co/spaces/CultriX/Yet_Another_LLM_Leaderboard) created by CultriX, with particularly strong TruthfulQA and BigBench scores.

| Model | Average | AGIEval | GPT4All | TruthfulQA | BigBench |
|---|---:|---:|---:|---:|---:|
| [**WestSeverus-7B-DPO-v2**](https://huggingface.co/FelixChao/WestSeverus-7B-DPO-v2) | **60.98** | 45.29 | 77.20 | **72.72** | **48.71** |
| [CultriX/Wernicke-7B-v1](https://huggingface.co/CultriX/Wernicke-7B-v1) | 60.73 | 45.59 | 77.36 | 71.46 | 48.49 |
| [mlabonne/NeuralBeagle14-7B](https://huggingface.co/mlabonne/NeuralBeagle14-7B) | 60.25 | 46.06 | 76.77 | 70.32 | 47.86 |
| [CultriX/MistralTrix-v1](https://huggingface.co/CultriX/MistralTrix-v1) | 60.05 | 44.98 | 76.62 | 71.44 | 47.17 |
| [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) | 59.42 | 44.27 | 77.86 | 67.46 | 48.09 |
| [mlabonne/Daredevil-7B](https://huggingface.co/mlabonne/Daredevil-7B) | 58.22 | 44.85 | 76.07 | 64.89 | 47.07 |
| [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) | 44.61 | 27.96 | 70.84 | 44.46 | 35.17 |

## 🏆 Open LLM Leaderboard

WestSeverus-7B-DPO-v2 is one of the top 7B models on the Open LLM Leaderboard, with particularly strong TruthfulQA and GSM8K scores.

| Metric | Value |
|---------------------------------|----:|
| Avg. | 75.29 |
| AI2 Reasoning Challenge (25-shot) | 71.42 |
| HellaSwag (10-shot) | 88.27 |
| MMLU (5-shot) | 64.79 |
| TruthfulQA (0-shot) | 72.37 |
| Winogrande (5-shot) | 83.27 |
| GSM8K (5-shot) | 71.65 |

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_FelixChao__WestSeverus-7B-DPO-v2).

## ⚡ EvalPlus Leaderboard

| Model | HumanEval | HumanEval_Plus | MBPP | MBPP_Plus |
|---|---:|---:|---:|---:|
| phi-2-2.7B | 48.2 | 43.3 | 61.9 | 51.4 |
| **WestSeverus-7B-DPO-v2** | 43.3 | 34.1 | TBD | TBD |
| SOLAR-10.7B-Instruct-v1.0 | 42.1 | 34.3 | 42.9 | 34.6 |
| CodeLlama-7B | 37.8 | 34.1 | 57.6 | 45.4 |

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a53b0747a04f0512941b6f/lL72F41NUueFMP7p-fPl7.png)

## ⚙️ Prompt Format

WestSeverus-7B-DPO-v2 was trained using the ChatML prompt template with system prompts. An example follows below:

```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

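The template above can also be assembled programmatically. The helper below is a minimal sketch (the function name is illustrative, not part of the model's tooling):

```python
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    """Assemble a ChatML prompt string matching the template above.

    The trailing ``<|im_start|>assistant`` turn is left open so that
    generation continues with the assistant's reply.
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Example:
text = build_chatml_prompt("You are a helpful assistant.", "What is 7 * 6?")
```

With the `transformers` library, calling `tokenizer.apply_chat_template` on a list of role/content messages should produce an equivalent string for a ChatML-formatted checkpoint.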
## 🛠️ Quantized Models

### Another version of the WestSeverus model:

* [**PetroGPT/WestSeverus-7B-DPO**](https://huggingface.co/PetroGPT/WestSeverus-7B-DPO)
  * **GGUF**: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GGUF
  * **GGUF**: https://huggingface.co/s3nh/WestSeverus-7B-DPO-GGUF
  * **GPTQ**: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GPTQ
  * **AWQ**: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-AWQ

### MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF

* **GGUF**: https://huggingface.co/MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF

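As a rough sketch of how one of the GGUF builds above might be run locally with `llama-cpp-python` (the exact filename, here a hypothetical Q4_K_M variant, is an assumption; check the repository's file list for the real names):

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download one quantized file from the GGUF repo listed above.
# The filename is an assumption -- substitute a real one from the repo.
path = hf_hub_download(
    repo_id="MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF",
    filename="WestSeverus-7B-DPO-v2.Q4_K_M.gguf",
)

# Load the model with the ChatML chat format it was trained with.
llm = Llama(model_path=path, chat_format="chatml", n_ctx=4096)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 7 * 6?"},
    ]
)
print(response["choices"][0]["message"]["content"])
```

Note that the download is several gigabytes; smaller or larger quantization levels trade quality against memory use.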
## 🙏 Gratitude

* Thanks to @senseable for [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2).
* Thanks to @jondurbin for the [jondurbin/truthy-dpo-v0.1 dataset](https://huggingface.co/datasets/jondurbin/truthy-dpo-v0.1).
* Thanks to @Charles Goddard for MergeKit.
* Thanks to @TheBloke, @s3nh, and @MaziyarPanahi for the quantized models.
* Thanks to @mlabonne and @CultriX for YALL - Yet Another LLM Leaderboard.
* Thank you to everyone else in the open-source AI community who has used this model for further research and improvement.