---
base_model:
- FallenMerick/Chunky-Lemon-Cookie-11B
- Sao10K/Fimbulvetr-11B-v2.1-16K
- senseable/WestLake-7B-v2
base_model_relation: merge
library_name: transformers
tags:
- mergekit
- merge
- roleplay
- text-generation-inference
license: cc-by-4.0
---

![cute](https://huggingface.co/matchaaaaa/Honey-Yuzu-13B/resolve/main/honey-yuzu-cute.png)

**Thank you [@Brooketh](https://huggingface.co/brooketh) for the [GGUFs](https://huggingface.co/backyardai/Honey-Yuzu-13B-GGUF)!!**

# Honey-Yuzu-13B

Meet Honey-Yuzu, a sweet lemony tea brewed by yours truly! A bit of [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) here for its great flavor, with a dash of [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) there to add some depth. I'm really proud of how it turned out, and I hope you like it too!

It's not as verbose as Chaifighter, but it still writes very well. It boasts fantastic coherence and character understanding (in my opinion) for a 13B, and it's been my daily driver for a little bit. It's a solid RP model that should generally play nice with just about anything.

**Native Context Length: 8K/8192** *(can be extended using RoPE scaling, possibly past 16K)*

## Prompt Template: Alpaca

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```
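
If you're building prompts by hand, the template above can be filled in with a few lines of Python. This is only a minimal sketch: `build_alpaca_prompt` is a hypothetical helper name, and the trailing newline after `### Response:` is an assumption about how your frontend expects the prompt to end.

```python
# Alpaca template as shown above; {prompt} is the only placeholder.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "\n"
    "### Instruction:\n"
    "{prompt}\n"
    "\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Fill the Alpaca template with a single instruction (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(prompt=instruction.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Describe a cozy tea shop in two sentences."))
```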

## Recommended Settings: Universal-Light

Here are some settings ranges that tend to work for me. They aren't strict values, and there's a bit of leeway in them. Feel free to experiment a bit!

* Temperature:        **1.0** to **1.25**
* Min-P:              **0.05** to **0.1**
* Repetition Penalty: **1.05** to **1.1** (high values aren't needed and usually degrade output)
* Rep. Penalty Range: **256** or **512**
* *(all other samplers disabled)*
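
As a rough illustration, the ranges above could be turned into sampling keyword arguments like this. The sketch assumes the Hugging Face `transformers` parameter names (`temperature`, `min_p`, `repetition_penalty`, `top_k`, `top_p`); `universal_light` is a hypothetical helper, the defaults are just midpoints picked for the example, and the repetition penalty range is left out because `generate` has no matching option as far as I know.

```python
# Recommended ranges from the list above (inclusive).
RANGES = {
    "temperature": (1.0, 1.25),
    "min_p": (0.05, 0.1),
    "repetition_penalty": (1.05, 1.1),
}


def universal_light(temperature=1.1, min_p=0.075, repetition_penalty=1.05):
    """Return sampling kwargs, rejecting values outside the recommended ranges."""
    chosen = {
        "temperature": temperature,
        "min_p": min_p,
        "repetition_penalty": repetition_penalty,
    }
    for name, value in chosen.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low} to {high}")
    # top_k=0 and top_p=1.0 switch those samplers off, matching
    # "all other samplers disabled".
    return {"do_sample": True, "top_k": 0, "top_p": 1.0, **chosen}


if __name__ == "__main__":
    print(universal_light())
    # Hypothetical usage with an already loaded model and tokenized inputs:
    # output = model.generate(**inputs, **universal_light(), max_new_tokens=256)
```

The ranges aren't strict, so feel free to loosen the check if you want to experiment outside them.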

## The Deets

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

### Merge Method

This model was merged using the passthrough merge method.

### Models Merged

The following models were included in the merge:
* [FallenMerick/Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B)
  * [SanjiWatsuki/Kunoichi-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-7B)
  * [SanjiWatsuki/Silicon-Maid-7B](https://huggingface.co/SanjiWatsuki/Silicon-Maid-7B)
  * [KatyTheCutie/LemonadeRP-4.5.3](https://huggingface.co/KatyTheCutie/LemonadeRP-4.5.3)
  * [Sao10K/Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K)
  * [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
* [senseable/WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2)

### The Special Sauce

The following YAML configuration was used to produce this model:

```yaml
slices: # this is a quick float32 restack of BLC using the OG recipe
  - sources:
    - model: SanjiWatsuki/Kunoichi-7B
      layer_range: [0, 24]
  - sources:
    - model: SanjiWatsuki/Silicon-Maid-7B
      layer_range: [8, 24]
  - sources:
    - model: KatyTheCutie/LemonadeRP-4.5.3
      layer_range: [24, 32]
merge_method: passthrough
dtype: float32
name: Big-Lemon-Cookie-11B
---
models: # this is a remake of CLC with the newer Fimbul v2.1 version
  - model: Big-Lemon-Cookie-11B
    parameters:
      weight: 0.85
  - model: Sao10K/Fimbulvetr-11B-v2.1-16K
    parameters:
      weight: 0.15
merge_method: linear
dtype: float32
name: Chunky-Lemon-Cookie-11B
---
slices: # 8 layers of WL for the splice
  - sources:
    - model: senseable/WestLake-7B-v2
      layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: WL-splice
---
slices: # 8 layers of CLC for the splice
  - sources:
    - model: Chunky-Lemon-Cookie-11B
      layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: CLC-splice
---
models: # this is the splice, a gradient merge meant to gradually and smoothly interpolate between stacks of different models
  - model: WL-splice
    parameters:
      weight: [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0] # 0.125 / 0.875 values removed here - "math gets screwy" 
  - model: CLC-splice
    parameters:
      weight: [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1] # 0.125 / 0.875 values removed here - "math gets screwy" 
merge_method: dare_linear # according to some paper, "DARE is all you need"
base_model: WL-splice
dtype: float32
name: splice
---
slices: # putting it all together
  - sources:
    - model: senseable/WestLake-7B-v2
      layer_range: [0, 16]
  - sources: 
    - model: splice
      layer_range: [0, 8]
  - sources:
    - model: Chunky-Lemon-Cookie-11B
      layer_range: [16, 48]
merge_method: passthrough
dtype: float32
name: Honey-Yuzu-13B
```
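
To get a feel for what the two `weight` lists in the splice step do, here is a small sketch that spreads each list over the 8 spliced layers. It assumes that mergekit treats a list-valued parameter as a gradient and linearly interpolates it across the layers; the exact mapping from layer index to position (`layer_idx / (num_layers - 1)`) and the name `gradient_value` are assumptions made for illustration. The random drop-and-rescale step of `dare_linear` is not modeled here.

```python
import math

# Anchor values copied from the splice step of the config above.
WL_ANCHORS = [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0]
CLC_ANCHORS = [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1]
NUM_LAYERS = 8  # both splices cover layers 8 to 16


def gradient_value(anchors, layer_idx, num_layers):
    """Linearly interpolate the anchor list at one layer's relative depth."""
    t = layer_idx / max(num_layers - 1, 1)  # relative depth in [0, 1]
    scaled = t * (len(anchors) - 1)
    lower, upper = math.floor(scaled), math.ceil(scaled)
    frac = scaled - lower
    return (1 - frac) * anchors[lower] + frac * anchors[upper]


wl_weights = [gradient_value(WL_ANCHORS, i, NUM_LAYERS) for i in range(NUM_LAYERS)]
clc_weights = [gradient_value(CLC_ANCHORS, i, NUM_LAYERS) for i in range(NUM_LAYERS)]

if __name__ == "__main__":
    for i, (wl, clc) in enumerate(zip(wl_weights, clc_weights)):
        print(f"layer {i}: WL={wl:.3f}  CLC={clc:.3f}  sum={wl + clc:.3f}")
```

Under these assumptions the splice starts as pure WestLake, ends as pure CLC, and the two weights add up to 1 at every layer in between.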

### The Thought Process

This was meant to be a simple RP-focused merge. I chose two well-performing RP models - [Chunky-Lemon-Cookie-11B](https://huggingface.co/FallenMerick/Chunky-Lemon-Cookie-11B) by [FallenMerick](https://huggingface.co/FallenMerick) and [WestLake-7B-v2](https://huggingface.co/senseable/WestLake-7B-v2) by [senseable](https://huggingface.co/senseable) - and merged them using a more conventional configuration (okay, okay, a 56-layer 12.5B Mistral isn't that conventional, but still) rather than trying something wild or crazy and pushing the limits. I was very pleased with the results, but I wanted to see what would happen if I remade CLC with [Fimbulvetr-11B-v2.1-16K](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K) by [Sao10K](https://huggingface.co/Sao10K). This resulted in equally nice (if not slightly better) outputs along with a greatly improved native context length.
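
The layer and parameter counts can be checked with some quick arithmetic. The sketch below adds up the `layer_range` entries from the config and estimates the parameter count, assuming the standard Mistral-7B-v0.1 dimensions (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, vocabulary of 32000, untied output head). Those dimensions are not stated in this card, so treat the result as an estimate.

```python
def span(layer_range):
    """Number of layers in a half-open [start, end) layer_range."""
    start, end = layer_range
    return end - start


# Chunky-Lemon-Cookie-11B stack: Kunoichi + Silicon-Maid + LemonadeRP slices.
clc_layers = span([0, 24]) + span([8, 24]) + span([24, 32])

# Honey-Yuzu-13B stack: WestLake + splice + CLC slices.
total_layers = span([0, 16]) + span([0, 8]) + span([16, 48])

# Assumed Mistral-7B-v0.1 dimensions (not taken from this card).
HIDDEN, MLP, KV_DIM, VOCAB = 4096, 14336, 8 * 128, 32000

attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q and o, then k and v
mlp = 3 * HIDDEN * MLP                                 # gate, up and down
norms = 2 * HIDDEN                                     # two RMSNorms per layer
per_layer = attention + mlp + norms

# Input embeddings, output head and the final norm.
outside_layers = 2 * VOCAB * HIDDEN + HIDDEN
total_params = total_layers * per_layer + outside_layers

if __name__ == "__main__":
    print(f"CLC layers:   {clc_layers}")
    print(f"Total layers: {total_layers}")
    print(f"Parameters:   {total_params / 1e9:.2f}B")
```

With these numbers, CLC comes out to 48 layers and the final stack to 56 layers and roughly 12.5B parameters.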



Have feedback? Comments? Questions? Don't hesitate to let me know! As always, have a wonderful day, and please be nice to yourself! :)