---
license: apache-2.0
language:
- en
---
# WestSeverus-7B-DPO-v2
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a53b0747a04f0512941b6f/-_CvSGuu-kQ1GDNzVMYjg.png)
## ☘️ Model Description
WestSeverus-7B-DPO-v2 is a WestLake-family model trained on top of [WestSeverus-7B](https://huggingface.co/FelixChao/WestSeverus-7B).
The model was trained on several DPO datasets and performs well on basic math problems.
WestSeverus-7B-DPO-v2 can be used for mathematics, chemistry, physics, and even coding tasks, for further research and reference.
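A minimal inference sketch with 🤗 Transformers is shown below; the generation settings are illustrative rather than tuned, and an fp16-capable GPU is assumed:
```python
# Minimal inference sketch for WestSeverus-7B-DPO-v2 (illustrative, untuned settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FelixChao/WestSeverus-7B-DPO-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumes a GPU with enough VRAM for fp16 7B weights
    device_map="auto",
)

prompt = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```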
# 📖 Table of Contents
1. [Nous Benchmark Results](#🪄-nous-benchmark-results)
- AGIEval
- GPT4All
- TruthfulQA Scores
- BigBench
2. [Open LLM Leaderboard](#🏆-open-llm-leaderboard)
- ARC
- HellaSwag
- MMLU
- TruthfulQA
- Winogrande
- GSM8K
3. [EvalPlus Leaderboard](#⚡-evalplus-leaderboard)
- HumanEval
- HumanEval_Plus
- MBPP
- MBPP_Plus
4. [Prompt Format](#⚗️-prompt-format)
5. [Quantized Models](#🛠️-quantized-models)
6. [Gratitude](#🙏-gratitude)
## 🪄 Nous Benchmark Results
WestSeverus-7B-DPO-v2 currently sits at the top of the [YALL - Yet Another LLM Leaderboard](https://huggingface.co/spaces/CultriX/Yet_Another_LLM_Leaderboard) created by CultriX, and it leads the models listed below on TruthfulQA and BigBench.
| Model | Average | AGIEval | GPT4All | TruthfulQA | BigBench |
|---|---:|---:|---:|---:|---:|
| [**WestSeverus-7B-DPO-v2**](https://huggingface.co/FelixChao/WestSeverus-7B-DPO-v2) | **60.98** | 45.29 | 77.2 | **72.72** | **48.71** |
| [CultriX/Wernicke-7B-v1](https://huggingface.co/CultriX/Wernicke-7B-v1) | 60.73 | 45.59 | 77.36 | 71.46 | 48.49 |
| [mlabonne/NeuralBeagle14-7B](https://huggingface.co/mlabonne/NeuralBeagle14-7B) | 60.25 | 46.06 | 76.77 | 70.32 | 47.86 |
| [CultriX/MistralTrix-v1](https://huggingface.co/CultriX/MistralTrix-v1) | 60.05 | 44.98 | 76.62 | 71.44 | 47.17 |
| [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) | 59.42 | 44.27 | 77.86 | 67.46 | 48.09 |
| [mlabonne/Daredevil-7B](https://huggingface.co/mlabonne/Daredevil-7B) | 58.22 | 44.85 | 76.07 | 64.89 | 47.07 |
| [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) | 44.61 | 27.96 | 70.84 | 44.46 | 35.17 |
## πŸ† Open LLM Leaderboard
WestSeverus-7B-DPO-v2 is one of the top 7B model in Open LLM Leaderboard and it outperforms on TruthfulQA and GSM8K.
| Metric | Value |
|---|---:|
| Avg. | 75.29 |
| AI2 Reasoning Challenge (25-shot) | 71.42 |
| HellaSwag (10-shot) | 88.27 |
| MMLU (5-shot) | 64.79 |
| TruthfulQA (0-shot) | 72.37 |
| Winogrande (5-shot) | 83.27 |
| GSM8K (5-shot) | 71.65 |
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_FelixChao__WestSeverus-7B-DPO-v2).
## ⚡ EvalPlus Leaderboard
| Model | HumanEval | HumanEval_Plus | MBPP | MBPP_Plus |
|---|---:|---:|---:|---:|
| phi-2-2.7B | 48.2 | 43.3 | 61.9 | 51.4 |
| **WestSeverus-7B-DPO-v2** | 43.3 | 34.1 | TBD | TBD |
| SOLAR-10.7B-Instruct-v1.0 | 42.1 | 34.3 | 42.9 | 34.6 |
| CodeLlama-7B | 37.8 | 34.1 | 57.6 | 45.4 |
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a53b0747a04f0512941b6f/lL72F41NUueFMP7p-fPl7.png)
## ⚗️ Prompt Format
WestSeverus-7B-DPO-v2 was trained using the ChatML prompt template with system prompts. An example:
```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
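If the repository's tokenizer config ships this ChatML template (an assumption worth verifying against the repo), 🤗 Transformers can render the prompt for you; otherwise the string can be built by hand as above. A minimal sketch:
```python
# Build a ChatML prompt via the tokenizer's chat template (sketch).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("FelixChao/WestSeverus-7B-DPO-v2")

messages = [
    {"role": "system", "content": "You are a careful math tutor."},
    {"role": "user", "content": "Solve 12 * 17 step by step."},
]

# add_generation_prompt=True appends the trailing '<|im_start|>assistant' turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```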
## 🛠️ Quantized Models
### Another version of the WestSeverus model
* [**PetroGPT/WestSeverus-7B-DPO**](https://huggingface.co/PetroGPT/WestSeverus-7B-DPO)
* **GGUF**: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GGUF
* **GGUF**: https://huggingface.co/s3nh/WestSeverus-7B-DPO-GGUF
* **GPTQ**: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-GPTQ
* **AWQ**: https://huggingface.co/TheBloke/WestSeverus-7B-DPO-AWQ
### MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF
* **GGUF**: https://huggingface.co/MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF
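As a sketch of running the GGUF weights locally with llama-cpp-python (the `Q4_K_M` file pattern is an assumption; pick whichever quant the repo actually provides):
```python
# Load a GGUF quant of WestSeverus-7B-DPO-v2 with llama-cpp-python (sketch).
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quant level; match a file actually in the repo
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 15% of 240?"},
    ],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```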
## 🙏 Gratitude
* Thanks to @senseable for [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2).
* Thanks to @jondurbin for the [jondurbin/truthy-dpo-v0.1 dataset](https://huggingface.co/datasets/jondurbin/truthy-dpo-v0.1).
* Thanks to @Charles Goddard for MergeKit.
* Thanks to @TheBloke, @s3nh, and @MaziyarPanahi for the quantized models.
* Thanks to @mlabonne and @CultriX for YALL - Yet Another LLM Leaderboard.
* Thanks to everyone else in the open-source AI community who has used this model for further research and improvement.