Severian committed on
Commit 1d0e3a5
1 Parent(s): 7cacd75

Update README.md

Files changed (1)
  1. README.md +233 -4
README.md CHANGED
@@ -6,18 +6,247 @@ datasets:
  pipeline_tag: text-generation
  ---
 
- # New Fixed Version with extended training being uploaded by end of day 3/5!
+ # New Fixed Version with extended training available now!
 
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/64740cf7485a7c8e1bd51ac9/GO4MY_3adP2G9EHKZbZpg.webp" width="500" height="500">
 
  ## Unfortunately, there are some issues with how this model was fused during training, leading to bad outputs. I am retraining and will re-upload ASAP. In the meantime, you can still use the Q8 GGUF version, which works great.
 
+ 
+ This is the second model trained on the experimental 'Internal Knowledge Map' dataset. It is designed to go beyond ordinary data processing and build comprehensive understanding and reasoning across a wide range of knowledge domains. Its reasoning is grounded in a curated dataset that emphasizes the interrelations between diverse disciplines, with the aim of synthesizing, integrating, and applying complex information in ways that mimic human abstract reasoning and creative thought.
+ 
+ At the core of this model's development is the goal of having LLMs engage in cognitive activity that goes beyond memorization into abstract reasoning, problem-solving, and the generation of new insights. To that end, 'Nexus-IKM-Mistral-7B' was fine-tuned for 10 epochs on this dataset, which produced stronger insight generation and problem-solving in complex, multi-disciplinary settings: an improved ability to draw links between different pieces of knowledge, reason through complex scenarios, and propose innovative solutions spanning science, technology, environmental studies, and the humanities.
+ 
+ Test this out and see if you find anything interesting or intriguing. I will keep iterating on new versions, but this one seems like a fun and useful place to start.
+ 
+ 
  ## GGUF Q8 Version: https://huggingface.co/Severian/Nexus-IKM-Mistral-7B-GGUF
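Until the re-upload lands, the GGUF build is the recommended way to run the model. As a minimal sketch of doing that locally (assuming the llama-cpp-python package; the GGUF filename and the prompt are placeholders, so check the GGUF repo for the actual filename):

```python
# Minimal sketch: load the Q8 GGUF with llama-cpp-python and run one chat turn.
from llama_cpp import Llama

llm = Llama(
    model_path="nexus-ikm-mistral-7b.Q8_0.gguf",  # assumed filename; see the GGUF repo
    n_ctx=4096,                                   # context window, adjust as needed
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "How could mycology inform network design?"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```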
 
 
  **If you'd like to train your own version, here is the full notebook to recreate the training on Unsloth yourself (https://colab.research.google.com/drive/1828t77iO2nLRXVfB8HoI11eFu-79-Oe7?usp=sharing). Just drop the train.jsonl from the dataset repo (https://huggingface.co/datasets/Severian/Internal-Knowledge-Map) into your Colab directory and rename it to dataset.jsonl, as in the sketch below.**
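For that Colab step, a minimal sketch of fetching train.jsonl and renaming it (assuming the huggingface_hub package, run from the notebook's working directory):

```python
# Minimal sketch: pull train.jsonl from the dataset repo and save it as
# dataset.jsonl, the filename the Unsloth notebook expects.
import shutil
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Severian/Internal-Knowledge-Map",
    filename="train.jsonl",
    repo_type="dataset",
)
shutil.copy(path, "dataset.jsonl")
```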
 
- This model is the second trained with experimental 'Internal Knowledge Map' dataset. Developed with an aim to go beyond the scope of usual data processing capabilities, this model gets trained to build comprehensive understanding and reasoning in a wide range of knowledge domains with elaborate guidelines. It bases its reasoning on a specially selected dataset emphasizing the interrelations of the diverse disciplines which aim to synthesize, integrate, and apply complex information in ways that mimic humanly abstract reasoning and creative thought processes.
- 
- At the very core of the development of this model is the desire to make sure that LLMs engage in a kind of cognitive activity not limited to memory but actually taking on abstract reasoning, problem-solving, and generation of new insights. To achieve this, 'Nexus-IKM-Mistral-7B' has been fine-tuned until 10 Epochs on this unique dataset, which resulted in the model demonstrating greater capability for giving rise to insights and problem-solving in complex, multi-disciplinary settings. This involves improved ability in drawing links between different pieces of knowledge, reasoning through complex scenarios, and proposing innovative solutions that cut across various domains, including science, technology, environmental studies, and humanities.
- 
- Test this out and see if you find anything interesting or intriguing. I will keep iterating more versions but this one seems like a fun and useful way to start.
+ 
+ 
+ ## Training Snapshot
+ 
+ ```
+ Step    Training Loss
+ 1       3.223000
+ 2       3.221300
+ 3       3.215900
+ 4       3.210600
+ 5       3.203000
+ 6       3.193500
+ 7       3.184000
+ 8       3.173400
+ 9       3.162400
+ 10      3.151500
+ 11      3.140500
+ 12      3.128800
+ 13      3.117600
+ 14      3.106700
+ 15      3.095500
+ 16      3.084700
+ 17      3.073700
+ 18      3.062700
+ 19      3.052300
+ 20      3.041800
+ 
+ 201     1.273200
+ 202     1.257600
+ 203     1.241900
+ 204     1.226100
+ 205     1.210800
+ 206     1.195500
+ 207     1.180800
+ 208     1.166000
+ 209     1.151200
+ 210     1.136900
+ 211     1.122000
+ 212     1.106600
+ 213     1.091200
+ 214     1.075200
+ 215     1.059200
+ 216     1.042900
+ 217     1.026600
+ 218     1.010300
+ 219     0.994200
+ 
+ 416     0.041700
+ 417     0.041700
+ 418     0.041600
+ 419     0.041600
+ 420     0.041600
+ 421     0.041600
+ 422     0.041500
+ 423     0.041500
+ 424     0.041500
+ 425     0.041400
+ 426     0.041400
+ 427     0.041400
+ 428     0.041400
+ 429     0.041300
+ 430     0.041300
+ 431     0.041300
+ 432     0.041200
+ 433     0.041200
+ 434     0.041200
+ 435     0.041100
+ 436     0.041200
+ 437     0.041100
+ 438     0.041100
+ 439     0.041100
+ 440     0.041000
+ 441     0.041000
+ 442     0.041000
+ 443     0.040900
+ 444     0.040900
+ 445     0.040900
+ 
+ 668     0.035200
+ 669     0.035100
+ 670     0.035100
+ 671     0.035100
+ 672     0.035100
+ 673     0.035000
+ 674     0.035000
+ 675     0.035000
+ 676     0.035000
+ 677     0.034900
+ 678     0.034900
+ 679     0.034900
+ 680     0.034800
+ 681     0.034800
+ 682     0.034800
+ 683     0.034800
+ 684     0.034800
+ 685     0.034700
+ 686     0.034700
+ 687     0.034700
+ 688     0.034700
+ 689     0.034600
+ 690     0.034600
+ 691     0.034600
+ 692     0.034600
+ 693     0.034500
+ 694     0.034500
+ 695     0.034500
+ 696     0.034400
+ 697     0.034400
+ 698     0.034400
+ 699     0.034400
+ 700     0.034300
+ 701     0.034300
+ 702     0.034300
+ 703     0.034300
+ 704     0.034200
+ 705     0.034200
+ 706     0.034200
+ 707     0.034200
+ 708     0.034100
+ 709     0.034100
+ 710     0.034100
+ 711     0.034100
+ 712     0.034000
+ 713     0.034000
+ 714     0.034000
+ 715     0.034000
+ 716     0.033900
+ 717     0.033900
+ 718     0.033800
+ 719     0.033800
+ 720     0.033800
+ 721     0.033800
+ 
+ 1209    0.006600
+ 1210    0.006500
+ 1211    0.006300
+ 1212    0.006200
+ 1213    0.006100
+ 1214    0.006000
+ 1215    0.005800
+ 1216    0.005700
+ 1217    0.005600
+ 1218    0.005500
+ 1219    0.005400
+ 1220    0.005300
+ 1221    0.005100
+ 1222    0.004900
+ 1223    0.004800
+ 1224    0.004700
+ 1225    0.004600
+ 1226    0.004500
+ 1227    0.004400
+ 1228    0.004300
+ 1229    0.004200
+ 1230    0.004000
+ 1231    0.003900
+ 1232    0.003800
+ 1233    0.003700
+ 1234    0.003500
+ 1235    0.003400
+ 1236    0.003300
+ 1237    0.003200
+ 1238    0.003000
+ 1239    0.003000
+ 1240    0.002900
+ 1241    0.002800
+ 1242    0.002700
+ 1243    0.002600
+ 1244    0.002500
+ 1245    0.002400
+ 1246    0.002300
+ 1247    0.002200
+ 1248    0.002100
+ 1249    0.002000
+ 1250    0.001900
+ 1251    0.001800
+ 1252    0.001800
+ 1253    0.001700
+ 1254    0.001600
+ 1255    0.001600
+ 1256    0.001500
+ 1257    0.001400
+ 1258    0.001300
+ 1259    0.001300
+ 1260    0.001200
+ 1261    0.001200
+ 1262    0.001100
+ 1263    0.001100
+ 1264    0.001000
+ 1265    0.001000
+ 1266    0.000900
+ 1267    0.000900
+ 1268    0.000800
+ 1269    0.000800
+ 1270    0.000800
+ 1271    0.000800
+ 1272    0.000700
+ 1273    0.000700
+ 1274    0.000700
+ 1275    0.000600
+ 1276    0.000600
+ 1277    0.000600
+ 1278    0.000600
+ 1279    0.000500
+ 1280    0.000500
+ 1281    0.000500
+ 1282    0.000500
+ 1283    0.000500
+ 1284    0.000500
+ 1285    0.000500
+ 1286    0.000400
+ 1287    0.000400
+ 1288    0.000400
+ 1289    0.000400
+ 1290    0.000400
+ 1291    0.000400
+ 1292    0.000400
+ 1293    0.000400
+ 1294    0.000400
+ 1295    0.000400
+ 1296    0.000400
+ 1297    0.000300
+ 1298    0.000300
+ ```
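For context, a per-step loss log like the snapshot above comes out of an Unsloth/TRL run along these lines. This is a sketch only: the base checkpoint, LoRA settings, and hyperparameters below are illustrative assumptions, not the notebook's actual configuration (use the linked Colab for that); only the 10 epochs and per-step logging reflect what is described above.

```python
# Sketch of an Unsloth + TRL fine-tune that would emit a per-step loss table
# like the training snapshot above. All hyperparameters here are assumptions.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load a 4-bit Mistral-7B base through Unsloth (assumed checkpoint name).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank and target modules are illustrative.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# dataset.jsonl is the renamed train.jsonl from the IKM dataset repo.
dataset = load_dataset("json", data_files="dataset.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumes a preformatted "text" field
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="outputs",
        num_train_epochs=10,            # the 10 epochs mentioned above
        per_device_train_batch_size=2,  # assumption
        learning_rate=2e-4,             # assumption
        logging_steps=1,                # logs Training Loss at every step
    ),
)
trainer.train()  # prints the Step / Training Loss table as it runs
```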