---
base_model:
- FallenMerick/Chunky-Lemon-Cookie-11B
- Sao10K/Fimbulvetr-11B-v2.1-16K
- senseable/WestLake-7B-v2
base_model_relation: merge
library_name: transformers
tags:
- mergekit
- merge
- roleplay
- text-generation-inference
license: cc-by-4.0
---
![cute](https://huggingface.co/matchaaaaa/Honey-Yuzu-13B/resolve/main/honey-yuzu-cute.png)
**Thank you [@Brooketh](https://huggingface.co/brooketh) for the [GGUFs](https://huggingface.co/backyardai/Honey-Yuzu-13B-GGUF)!!**
# Honey-Yuzu-13B
Meet Honey-Yuzu, a sweet lemony tea brewed by yours truly! A bit of [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) here for its great flavor, with a dash of [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) there to add some depth. I'm really proud of how it turned out, and I hope you like it too!
It's not as verbose as Chaifighter, but it still writes very well. It boasts fantastic coherence and character understanding (in my opinion) for a 13B, and it's been my daily driver for a little bit. It's a solid RP model that should generally play nice with just about anything.
**Native Context Length: 8K/8192** *(can be extended using RoPE, possibly past 16K)*
## Prompt Template: Alpaca
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{prompt}
### Response:
```
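For frontends without an Alpaca preset, the template is easy to apply by hand. A minimal sketch in Python (the constant and function names here are illustrative, not part of any API):

```python
# Minimal sketch of wrapping a raw prompt in the Alpaca template above.
# ALPACA_TEMPLATE mirrors the card's template; names are illustrative.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "{prompt}\n"
    "### Response:\n"
)

def format_alpaca(prompt: str) -> str:
    """Return the prompt wrapped in the Alpaca instruction format."""
    return ALPACA_TEMPLATE.format(prompt=prompt)

print(format_alpaca("Describe the aroma of yuzu."))
```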
## Recommended Settings: Universal-Light
Here are some settings ranges that tend to work for me. They aren't strict values, and there's a bit of leeway in them. Feel free to experiment a bit!
* Temperature: **1.0** to **1.25**
* Min-P: **0.05** to **0.1**
* Repetition Penalty: **1.05** to **1.1** (high values aren't needed and usually degrade output)
* Rep. Penalty Range: **256** or **512**
* *(all other samplers disabled)*
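If you run the model through the Hugging Face `transformers` `generate()` API instead of a frontend, the ranges above map roughly onto these keyword arguments. This is a hedged sketch: `min_p` requires a reasonably recent `transformers` release, and `generate()` has no direct equivalent of a repetition-penalty *range*, so that knob is noted but not set.

```python
# Universal-Light ranges expressed as transformers generate()/GenerationConfig
# keyword arguments. Values are midpoints of the ranges above, not mandates.
universal_light = {
    "do_sample": True,
    "temperature": 1.1,          # recommended range: 1.0 to 1.25
    "min_p": 0.075,              # 0.05 to 0.1; needs a recent transformers
    "repetition_penalty": 1.05,  # 1.05 to 1.1; higher tends to degrade output
    # transformers applies repetition_penalty over the whole context, so the
    # 256/512 "Rep. Penalty Range" knob (a koboldcpp/SillyTavern-style
    # setting) has no direct generate() equivalent here
    "top_p": 1.0,                # leave other samplers effectively disabled
    "top_k": 0,
}
# usage (illustrative): model.generate(**inputs, **universal_light)
```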
## The Deets
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
### Merge Method
This model was merged using the passthrough merge method.
### Models Merged
The following models were included in the merge:
* [FallenMerick/Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B)
* [SanjiWatsuki/Kunoichi-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-7B)
* [SanjiWatsuki/Silicon-Maid-7B](https://huggingface.co/SanjiWatsuki/Silicon-Maid-7B)
* [KatyTheCutie/LemonadeRP-4.5.3](https://huggingface.co/KatyTheCutie/LemonadeRP-4.5.3)
* [Sao10K/Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K)
* [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
* [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)
### The Special Sauce
The following YAML configuration was used to produce this model:
```yaml
slices: # this is a quick float32 restack of BLC using the OG recipe
- sources:
- model: SanjiWatsuki/Kunoichi-7B
layer_range: [0, 24]
- sources:
- model: SanjiWatsuki/Silicon-Maid-7B
layer_range: [8, 24]
- sources:
- model: KatyTheCutie/LemonadeRP-4.5.3
layer_range: [24, 32]
merge_method: passthrough
dtype: float32
name: Big-Lemon-Cookie-11B
---
models: # this is a remake of CLC with the newer Fimbul v2.1 version
- model: Big-Lemon-Cookie-11B
parameters:
weight: 0.85
- model: Sao10K/Fimbulvetr-11B-v2.1-16K
parameters:
weight: 0.15
merge_method: linear
dtype: float32
name: Chunky-Lemon-Cookie-11B
---
slices: # 8 layers of WL for the splice
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: WL-splice
---
slices: # 8 layers of CLC for the splice
- sources:
- model: Chunky-Lemon-Cookie-11B
layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: CLC-splice
---
models: # this is the splice, a gradient merge meant to gradually and smoothly interpolate between stacks of different models
- model: WL-splice
parameters:
weight: [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0] # 0.125 / 0.875 values removed here - "math gets screwy"
- model: CLC-splice
parameters:
weight: [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1] # 0.125 / 0.875 values removed here - "math gets screwy"
merge_method: dare_linear # according to some paper, "DARE is all you need"
base_model: WL-splice
dtype: float32
name: splice
---
slices: # putting it all together
- sources:
- model: senseable/WestLake-7B-v2
layer_range: [0, 16]
- sources:
- model: splice
layer_range: [0, 8]
- sources:
- model: Chunky-Lemon-Cookie-11B
layer_range: [16, 48]
merge_method: passthrough
dtype: float32
name: Honey-Yuzu-13B
```
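Two quick sanity checks on the recipe above, expressible in a few lines of Python (variable names are mine, not mergekit's; mergekit treats `layer_range` as half-open, i.e. `[start, end)`):

```python
# Sketch: sanity-checking two properties of the recipe above.
# 1) The splice's gradient weights are complementary: each layer's WL-splice
#    and CLC-splice weights sum to 1, so the dare_linear merge interpolates
#    smoothly from one stack to the other.
wl_weights  = [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0]
clc_weights = [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1]
complementary = all(
    abs(a + b - 1.0) < 1e-9 for a, b in zip(wl_weights, clc_weights)
)

# 2) The final passthrough stack totals 56 decoder layers
#    (16 from WestLake + 8 from the splice + 32 from CLC).
final_slices = [
    ("senseable/WestLake-7B-v2", (0, 16)),
    ("splice", (0, 8)),
    ("Chunky-Lemon-Cookie-11B", (16, 48)),
]
total_layers = sum(end - start for _, (start, end) in final_slices)

print(complementary, total_layers)  # expect: True 56
```

That 56-layer total is where the "56 layer 12.5B Mistral" figure below comes from.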
### The Thought Process
This was meant to be a simple RP-focused merge. I chose two well-performing RP models - [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) by [FallenMerick](https://huggingface.co/FallenMerick) and [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) by [senseable](https://huggingface.co/senseable) - and merged them using a more conventional configuration (okay, okay, a 56 layer 12.5B Mistral isn't *that* conventional, but still) rather than trying something wild and pushing the limits. I was very pleased with the results, but I wanted to see what would happen if I remade CLC with [Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K) by [Sao10K](https://huggingface.co/Sao10K). That swap produced equally nice (if not slightly better) outputs along with a greatly improved native context length.
Have feedback? Comments? Questions? Don't hesitate to let me know! As always, have a wonderful day, and please be nice to yourself! :) |