Weyaxi committed on
Commit 5838f49
1 Parent(s): 187846d

good readme

Files changed (1)
  1. README.md +126 -49
README.md CHANGED
@@ -1,18 +1,74 @@
  ---
- license: apache-2.0
- base_model: Qwen/Qwen2-7B
  tags:
  - axolotl
- - generated_from_trainer
- model-index:
- - name: Einstein-v7-Qwen2-7B
-   results: []
  ---
 
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.0`
@@ -178,62 +234,83 @@ special_tokens:
  tokens:
  - "<|im_start|>"
  - "<|im_end|>"
-
  ```

  </details><br>

- # Einstein-v7-Qwen2-7B

- This model is a fine-tuned version of [Qwen/Qwen2-7B](https://huggingface.co/Qwen/Qwen2-7B) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.6983

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 1e-05
- - train_batch_size: 6
- - eval_batch_size: 6
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 192
- - total_eval_batch_size: 48
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - num_epochs: 2

- ### Training results

- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | 0.9189 | 0.0 | 1 | 0.8840 |
- | 0.7368 | 0.5 | 125 | 0.7193 |
- | 0.7406 | 1.0 | 250 | 0.7037 |
- | 0.6593 | 1.48 | 375 | 0.6996 |
- | 0.6754 | 1.97 | 500 | 0.6983 |

- ### Framework versions

- - Transformers 4.40.0.dev0
- - Pytorch 2.4.0.dev20240508+rocm6.1
- - Datasets 2.15.0
- - Tokenizers 0.15.0
 
  ---
+ language:
+ - en
+ license: other
  tags:
  - axolotl
+ - instruct
+ - finetune
+ - chatml
+ - gpt4
+ - synthetic data
+ - science
+ - physics
+ - chemistry
+ - biology
+ - math
+ - qwen
+ - qwen2
+ base_model: Qwen/Qwen2-7B
+ datasets:
+ - allenai/ai2_arc
+ - camel-ai/physics
+ - camel-ai/chemistry
+ - camel-ai/biology
+ - camel-ai/math
+ - metaeval/reclor
+ - openbookqa
+ - mandyyyyii/scibench
+ - derek-thomas/ScienceQA
+ - TIGER-Lab/ScienceEval
+ - jondurbin/airoboros-3.2
+ - LDJnr/Capybara
+ - Cot-Alpaca-GPT4-From-OpenHermes-2.5
+ - STEM-AI-mtl/Electrical-engineering
+ - knowrohit07/saraswati-stem
+ - sablo/oasst2_curated
+ - lmsys/lmsys-chat-1m
+ - TIGER-Lab/MathInstruct
+ - bigbio/med_qa
+ - meta-math/MetaMathQA-40K
+ - openbookqa
+ - piqa
+ - metaeval/reclor
+ - derek-thomas/ScienceQA
+ - scibench
+ - sciq
+ - Open-Orca/SlimOrca
+ - migtissera/Synthia-v1.3
+ - TIGER-Lab/ScienceEval
+ - allenai/WildChat
+ - microsoft/orca-math-word-problems-200k
+ - openchat/openchat_sharegpt4_dataset
+ - teknium/GPTeacher-General-Instruct
+ - m-a-p/CodeFeedback-Filtered-Instruction
+ - totally-not-an-llm/EverythingLM-data-V3
+ - HuggingFaceH4/no_robots
+ - OpenAssistant/oasst_top1_2023-08-25
+ - WizardLM/WizardLM_evol_instruct_70k
+ - abacusai/SystemChat-1.1
+ - H-D-T/Buzz-V1.2
  ---
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/IoQXRzbtKFTTyS-SOegKe.png)

+ # 🔬 Einstein-v7-Qwen2-7B
+
+ This model is a fully fine-tuned version of [Qwen/Qwen2-7B](https://huggingface.co/Qwen/Qwen2-7B) on diverse datasets.
+
+ This model was fine-tuned on `8xMI300X` GPUs using [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).
+
+ This model's training was sponsored by [sponsor](https://sponsor).

  <details><summary>See axolotl config</summary>

  axolotl version: `0.4.0`
 
  tokens:
  - "<|im_start|>"
  - "<|im_end|>"

  ```

  </details><br>

+ # 💬 Prompt Template
+
+ You can use the ChatML prompt template with this model:
+
+ ### ChatML
+
+ ```
+ <|im_start|>system
+ {system}<|im_end|>
+ <|im_start|>user
+ {user}<|im_end|>
+ <|im_start|>assistant
+ {assistant}<|im_end|>
+ ```
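+
+ For example, a fully formatted prompt, with the final assistant turn left open for the model to complete, might look like this (the question is only an illustration):
+
+ ```
+ <|im_start|>system
+ You are a helpful AI assistant.<|im_end|>
+ <|im_start|>user
+ What is the chemical formula of water?<|im_end|>
+ <|im_start|>assistant
+ ```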
+
+ This prompt template is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
+ `tokenizer.apply_chat_template()` method:

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
+ model = AutoModelForCausalLM.from_pretrained("Weyaxi/Einstein-v7-Qwen2-7B")
+
+ messages = [
+     {"role": "system", "content": "You are a helpful AI assistant."},
+     {"role": "user", "content": "Hello!"}
+ ]
+ gen_input = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
+ output = model.generate(gen_input, max_new_tokens=256)
+ ```
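+
+ `generate` returns the prompt tokens followed by the newly generated ones, so a simple way to recover just the assistant reply (a sketch; decoding settings are up to you) is:
+
+ ```python
+ # Decode only the tokens generated after the prompt
+ reply = tokenizer.decode(output[0][gen_input.shape[-1]:], skip_special_tokens=True)
+ print(reply)
+ ```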
+
+ # 📊 Datasets used in this model
+
+ The datasets used to train this model are listed in the metadata section of the model card.
+
+ Note that some of the datasets listed in the metadata may have been filtered according to various criteria.
+
+ The results of this filtering process are available in the data folder of this repository:
+
+ [Weyaxi/Einstein-v7-Qwen2-7B/data](https://huggingface.co/Weyaxi/Einstein-v7-Qwen2-7B/tree/main/data)
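+
+ If you want to inspect these files locally, one rough way to fetch them with `huggingface_hub` (assuming they all live under `data/`) is:
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download only the data/ folder of the model repository
+ local_dir = snapshot_download(
+     repo_id="Weyaxi/Einstein-v7-Qwen2-7B",
+     allow_patterns=["data/*"],
+ )
+ print(local_dir)
+ ```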
+
+ # 🔄 Quantized versions

+ ## GGUF [@bartowski](https://huggingface.co/bartowski)

+ - https://huggingface.co/bartowski/Einstein-v7-Qwen2-7B-GGUF
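
+ A minimal sketch of loading one of these GGUF files with `llama-cpp-python` (the filename glob and quant level below are assumptions about the repo layout):

+ ```python
+ from llama_cpp import Llama
+
+ # Fetch a quant directly from the GGUF repo and run a ChatML-style chat completion
+ llm = Llama.from_pretrained(
+     repo_id="bartowski/Einstein-v7-Qwen2-7B-GGUF",
+     filename="*Q4_K_M.gguf",  # assumed quant; pick whichever file fits your hardware
+     n_ctx=4096,
+ )
+ out = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "Hello!"}]
+ )
+ print(out["choices"][0]["message"]["content"])
+ ```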

+ ## ExLlamaV2 [@bartowski](https://huggingface.co/bartowski)

+ - https://huggingface.co/bartowski/Einstein-v7-Qwen2-7B-exl2

+ # 🎯 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

+ # 🤖 Additional information about training

+ This model was fully fine-tuned for 2 epochs.

+ The total number of training steps was 500.

+ <details><summary>Loss graph</summary>

+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/bkJGgh_JUfKeRlTLo_ZcB.png)

+ </details><br>
+
+ # 🤝 Acknowledgments
+
+ Thanks to [sponsor](https://sponsor) for sponsoring this model.
+
+ Thanks to all the dataset authors mentioned in the datasets section.

+ Thanks to the [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) team for the training framework used to build this model.
+
+ Thanks to the entire open-source AI community.
+
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

+ If you would like to support me:

+ [☕ Buy Me a Coffee](https://www.buymeacoffee.com/weyaxi)