---
license: cc-by-nc-4.0
tags:
- not-for-all-audiences
- nsfw
---

First:
```yaml
layer_slices:
  - model: Undi95/MLewd-L2-Chat-13B
    start: 0
    end: 16
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 8
    end: 20
  - model: Undi95/MLewd-L2-Chat-13B
    start: 17
    end: 32
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 21
    end: 40
```

Inverted:
```yaml
layer_slices:
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 0
    end: 16
  - model: Undi95/MLewd-L2-Chat-13B
    start: 8
    end: 20
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 17
    end: 32
  - model: Undi95/MLewd-L2-Chat-13B
    start: 21
    end: 40
```

Precise:
```yaml
layer_slices:
  - model: Undi95/MLewd-L2-Chat-13B
    start: 0
    end: 8
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 4
    end: 12
  - model: Undi95/MLewd-L2-Chat-13B
    start: 9
    end: 16
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 13
    end: 22
  - model: Undi95/MLewd-L2-Chat-13B
    start: 17
    end: 24
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 23
    end: 32
  - model: Undi95/MLewd-L2-Chat-13B
    start: 25
    end: 32
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 33
    end: 40
```

PreciseInverted:
```yaml
layer_slices:
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 0
    end: 8
  - model: Undi95/MLewd-L2-Chat-13B
    start: 4
    end: 12
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 9
    end: 16
  - model: Undi95/MLewd-L2-Chat-13B
    start: 13
    end: 22
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 17
    end: 24
  - model: Undi95/MLewd-L2-Chat-13B
    start: 23
    end: 32
  - model: Undi95/MLewd-ReMM-L2-Chat-20B-Part1
    start: 25
    end: 32
  - model: Undi95/MLewd-L2-Chat-13B
    start: 33
    end: 40
```

Part1 = ReMM v2.1 merged with MLewd at a low weight to keep consistency. I call this "dilution", and the results show consistency and coherency without repetition or looping, despite the small amount of duplicated data.

The goal is to find the best way to interlace layers and hit a sweet spot between 13B and 30B+.

Normal/Inverted interlaces in chunks of 16 layers, and Precise/PreciseInverted in chunks of 8 layers.

All the models are made of 64(+1) layers. They still need testing.
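
To sanity-check how many layers a slice list stacks up, a small sketch like the one below can tally them. `count_layers` is a hypothetical helper, not part of any merge tool. This card does not say whether `end` is inclusive or exclusive, so the function supports both and the totals depend on which convention you assume.

```python
# Hypothetical helper for tallying layers in a layer_slices list.
# Each entry is (model, start, end), copied from the "First" config above.
FIRST = [
    ("Undi95/MLewd-L2-Chat-13B", 0, 16),
    ("Undi95/MLewd-ReMM-L2-Chat-20B-Part1", 8, 20),
    ("Undi95/MLewd-L2-Chat-13B", 17, 32),
    ("Undi95/MLewd-ReMM-L2-Chat-20B-Part1", 21, 40),
]


def count_layers(slices, inclusive_end=False):
    """Sum the layers contributed by every slice.

    inclusive_end=False treats `end` as exclusive (an assumption);
    inclusive_end=True counts the `end` layer as part of the slice.
    """
    extra = 1 if inclusive_end else 0
    total = 0
    for _model, start, end in slices:
        total += end - start + extra
    return total


if __name__ == "__main__":
    print("exclusive end:", count_layers(FIRST))
    print("inclusive end:", count_layers(FIRST, inclusive_end=True))
```

The same function works on the other three configs once their slices are written out as tuples.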

## Prompt template: Alpaca

```
Below is an instruction that describes a task. Write a response that completes the request.

### Instruction:
{prompt}

### Response:
```
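
As a minimal sketch of filling the template in code, the snippet below substitutes an instruction into the Alpaca layout shown above. `ALPACA_TEMPLATE` and `build_prompt` are illustrative names, and the trailing newline after `### Response:` is an assumption, not something this card specifies.

```python
# Alpaca-style template, copied from the block above.
# The trailing newline after "### Response:" is an assumption.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Insert the user's instruction into the Alpaca template."""
    return ALPACA_TEMPLATE.format(prompt=instruction)


if __name__ == "__main__":
    print(build_prompt("Write a short story about a lighthouse keeper."))
```

The resulting string can be passed as-is to whatever text-generation backend you use.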

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Undi95__MLewd-ReMM-L2-Chat-20B-Inverted)

| Metric              | Value |
|---------------------|-------|
| Avg.                | 50.81 |
| ARC (25-shot)       | 61.69 |
| HellaSwag (10-shot) | 85.32 |
| MMLU (5-shot)       | 58.0  |
| TruthfulQA (0-shot) | 53.77 |
| Winogrande (5-shot) | 75.61 |
| GSM8K (5-shot)      | 9.1   |
| DROP (3-shot)       | 12.16 |