athirdpath committed
Commit ab4b168
1 parent: 3eb390b

Update README.md

  1. README.md +33 -1
README.md CHANGED
@@ -2,4 +2,36 @@
 
 I have a theory! But, I have to go to bed, so I'm setting this to upload while I sleep.
 
-The 13Bs struggled because they were inherently lopsided. So, with this layout, I not only free up more parameters for further finetuning, I also address the imbalance. Crazy? Maybe.
+The 13Bs struggled because they were inherently lopsided. So, with this layout, I not only free up more parameters for further finetuning, I also address the imbalance. Crazy? Maybe.
+
+### Recipe
+
+slices:
+
+  - sources:
+
+    - model: mistralai/Mistral-7B-v0.1
+
+      layer_range: [0, 25]
+
+  - sources:
+
+    - model: mistralai/Mistral-7B-v0.1
+
+      layer_range: [7, 25]
+
+  - sources:
+
+    - model: mistralai/Mistral-7B-v0.1
+
+      layer_range: [7, 25]
+
+  - sources:
+
+    - model: mistralai/Mistral-7B-v0.1
+
+      layer_range: [7, 32]
+
+merge_method: passthrough
+
+dtype: bfloat16
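
To make the layout in the recipe above concrete, here is a rough Python sketch that counts how many layers the stacked model ends up with and how often each base layer is reused. It assumes mergekit treats `layer_range` as half-open (`[start, end)`), and it assumes the published Mistral-7B-v0.1 dimensions (hidden size 4096, MLP size 14336, 32 attention heads, 8 KV heads, vocabulary 32000, untied output head). The helper names are made up for illustration and are not part of mergekit.

```python
# Half-open [start, end) layer ranges copied from the recipe.
SLICES = [(0, 25), (7, 25), (7, 25), (7, 32)]

# Assumed Mistral-7B-v0.1 dimensions (not stated in the README itself).
HIDDEN = 4096
INTERMEDIATE = 14336
N_HEADS = 32
N_KV_HEADS = 8
VOCAB = 32000
BASE_LAYERS = 32


def stacked_layers(slices):
    """Total number of layers after concatenating every slice."""
    return sum(end - start for start, end in slices)


def layer_usage(slices, base_layers=BASE_LAYERS):
    """How many times each base-model layer appears in the stack."""
    counts = [0] * base_layers
    for start, end in slices:
        for i in range(start, end):
            counts[i] += 1
    return counts


def params_per_layer():
    """Weights in one decoder layer: attention + gated MLP + two RMSNorms."""
    kv_dim = (HIDDEN // N_HEADS) * N_KV_HEADS
    attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q, o, k, v
    mlp = 3 * HIDDEN * INTERMEDIATE  # gate, up, down projections
    norms = 2 * HIDDEN
    return attention + mlp + norms


def total_params(n_layers):
    """Decoder layers plus input embeddings, output head, and final norm."""
    embeddings = 2 * VOCAB * HIDDEN
    final_norm = HIDDEN
    return n_layers * params_per_layer() + embeddings + final_norm


if __name__ == "__main__":
    n = stacked_layers(SLICES)
    print("stacked layers:", n)
    print("usage per base layer:", layer_usage(SLICES))
    print("approx. parameters (billions):", round(total_params(n) / 1e9, 2))
```

Under those assumptions the stack is symmetric: the first seven and last seven base layers appear once each, while layers 7 through 24 appear four times.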