---
base_model:
- 152334H/miqu-1-70b-sf
license: unknown
language:
- en
pipeline_tag: text-generation
tags:
- merge
- frankenmerge
- 122b
---
# BigWeave v29 122b

<img src="https://cdn-uploads.huggingface.co/production/uploads/65a6db055c58475cf9e6def1/4CbbAN-X7ZWj702JrcCGH.png" width=600>

The BigWeave models aim to experimentally identify merge settings for increasing model performance. The version number merely tracks various attempts and is not a quality indicator. Only results demonstrating good performance are retained and shared.

# Prompting Format
ChatML, Mistral, and Vicuna.
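
For reference, generic versions of these prompt templates look like the following; `{system_prompt}` and `{prompt}` are placeholders, and nothing here is specific to BigWeave:

```
ChatML:
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant

Mistral:
[INST] {prompt} [/INST]

Vicuna:
{system_prompt}
USER: {prompt}
ASSISTANT:
```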

# Merge process
This is a self-merge of 152334H/miqu-1-70b-sf. Layers are repeated in groups of 4 with a 2-layer overlap. The first 9 and last 9 layers are not repeated.

Merge configuration:
```yaml
slices:
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [0,11]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [9,13]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [11,15]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [13,17]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [15,19]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [17,21]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [19,23]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [21,25]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [23,27]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [25,29]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [27,31]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [29,33]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [31,35]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [33,37]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [35,39]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [37,41]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [39,43]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [41,45]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [43,47]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [45,49]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [47,51]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [49,53]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [51,55]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [53,57]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [55,59]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [57,61]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [59,63]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [61,65]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [63,67]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [65,69]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [67,71]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [69,80]
merge_method: passthrough
dtype: float16
```
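
The slice list follows a regular pattern: an unrepeated head, 4-layer windows that slide by 2 (so each window repeats the last 2 layers of the previous one), and an unrepeated tail. As a sketch of that pattern, the hypothetical Python helper below (written for this card only; mergekit consumes the YAML directly) regenerates the slice list from the boundaries in the config:

```python
# Sketch of the slice pattern above. `build_slices` is a hypothetical
# helper for illustration; the boundary values come from the config.
MODEL = "152334H/miqu-1-70b-sf"  # 80-layer base model

def build_slices(total=80, head=11, tail_start=69, group=4, step=2):
    ranges = [[0, head]]                # head slice: layers 0-10
    start = head - step                 # first window overlaps the head by `step`
    while start < tail_start:           # windows [9,13], [11,15], ..., [67,71]
        ranges.append([start, start + group])
        start += step
    ranges.append([tail_start, total])  # tail slice: layers 69-79
    return ranges

# Emit the slice list in the same shape as the YAML above.
print("slices:")
for lo, hi in build_slices():
    print("  - sources:")
    print(f"    - model: {MODEL}")
    print(f"      layer_range: [{lo},{hi}]")
```

This yields 32 slices spanning 142 layers, up from 80 in the base model, which is where the roughly 122b parameter count comes from.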