---
license: other
license_name: yi-license
license_link: https://huggingface.co/01-ai/Yi-34B/blob/main/LICENSE
language:
- en
pipeline_tag: conversational
---
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/644ba0c76ebb3ebf7264dbe9/PWn9I-0XH7kSP_YXcyxIg.png" width="400"/>
</p>

---

# SG Raccoon tess-to-capy 66B

An auto-regressive causal LM created by combining two finetuned [Yi-34B](https://huggingface.co/01-ai/Yi-34B) models with *200K context* into one.

# Prompting Format

```
SYSTEM: <ANY SYSTEM CONTEXT>
USER:
ASSISTANT:
```

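A minimal helper for assembling prompts in this format (the function name and the choice to place the user message inline after `USER:` are illustrative assumptions, not part of the model's tooling):

```python
def build_prompt(user_message: str, system_context: str = "") -> str:
    """Assemble a prompt in the SYSTEM/USER/ASSISTANT format shown above.

    The trailing 'ASSISTANT:' leaves room for the model's completion.
    """
    lines = []
    if system_context:
        lines.append(f"SYSTEM: {system_context}")
    lines.append(f"USER: {user_message}")
    lines.append("ASSISTANT:")
    return "\n".join(lines)


print(build_prompt("Summarize the Yi license.", "You are a helpful assistant."))
```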
# Merge Process

The models used in the merge are [Tess-M-v1.3](https://huggingface.co/migtissera/Tess-M-v1.3/) and [Nous-Capybara-34B](https://huggingface.co/NousResearch/Nous-Capybara-34B).

The layer ranges used are as follows:
```yaml
- model: ehartford/dolphin-2_2-yi-34b
  layer_range: [0, 14]
- model: OrionStarAI/OrionStar-Yi-34B-Chat-Llama
  layer_range: [7, 21]
- model: ehartford/dolphin-2_2-yi-34b
  layer_range: [15, 29]
- model: OrionStarAI/OrionStar-Yi-34B-Chat-Llama
  layer_range: [22, 36]
- model: ehartford/dolphin-2_2-yi-34b
  layer_range: [30, 44]
- model: OrionStarAI/OrionStar-Yi-34B-Chat-Llama
  layer_range: [37, 51]
- model: ehartford/dolphin-2_2-yi-34b
  layer_range: [45, 59]
```
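For reference, ranges like these slot into a full mergekit config roughly as sketched below. The `slices`/`sources` wrapping, `merge_method: passthrough`, and `dtype` values are assumptions based on mergekit's usual layout for frankenmerges, not the exact file used here:

```yaml
slices:
  - sources:
      - model: ehartford/dolphin-2_2-yi-34b
        layer_range: [0, 14]
  - sources:
      - model: OrionStarAI/OrionStar-Yi-34B-Chat-Llama
        layer_range: [7, 21]
  # ...remaining slices follow the same pattern...
merge_method: passthrough
dtype: float16
```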

# Benchmarks

Coming soon.

# Acknowledgements

- Special thanks to [MSS](https://milanosamplesale.com/) for sponsoring this project.
- Thanks to [@chargoddard](https://huggingface.co/chargoddard) for developing [mergekit](https://github.com/cg123/mergekit), the framework used to merge the models.
- Many thanks to [@Undi95](https://huggingface.co/Undi95) for helping figure out the model merge options.
- Credits to the [01-ai](https://huggingface.co/01-ai) team for their amazing models.
- This merged model is inspired by [Goliath 120B](https://huggingface.co/alpindale/goliath-120b).