csabakecskemeti committed
Commit 83b2a7e · verified · 1 Parent(s): 65b994e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +6 -113
README.md CHANGED
@@ -1,119 +1,12 @@
  ---
- license: llama3.2
- datasets:
- - microsoft/orca-agentinstruct-1M-v1
- pipeline_tag: text-generation
- library_name: transformers
  base_model:
- - DevQuasar/analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit
- model-index:
- - name: analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit
-   results:
-   - task:
-       type: text-generation
-     dataset:
-       type: lm-evaluation-harness
-       name: bbh
-     metrics:
-     - name: acc_norm
-       type: acc_norm
-       value: 0.4168
-       verified: false
-   - task:
-       type: text-generation
-     dataset:
-       type: lm-evaluation-harness
-       name: gpqa
-     metrics:
-     - name: acc_norm
-       type: acc_norm
-       value: 0.2691
-       verified: false
-   - task:
-       type: text-generation
-     dataset:
-       type: lm-evaluation-harness
-       name: math
-     metrics:
-     - name: exact_match
-       type: exact_match
-       value: 0.0867
-       verified: false
-   - task:
-       type: text-generation
-     dataset:
-       type: lm-evaluation-harness
-       name: mmlu
-     metrics:
-     - name: acc_norm
-       type: acc_norm
-       value: 0.2822
-       verified: false
-   - task:
-       type: text-generation
-     dataset:
-       type: lm-evaluation-harness
-       name: musr
-     metrics:
-     - name: acc_norm
-       type: acc_norm
-       value: 0.3648
-       verified: false
-   - task:
-       type: text-generation
-     dataset:
-       type: lm-evaluation-harness
-       name: hellaswag
-     metrics:
-     - name: acc
-       type: acc
-       value: 0.5141
-       verified: false
-     - name: acc_norm
-       type: acc_norm
-       value: 0.6793
-       verified: false
-
  ---

- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e6d37e02dee9bcb9d9fa18/X4WG8AnMFqJuWkRvA0CrW.png)
-
- ### Eval
-
- The fine tuned model (DevQuasar/analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit)
- has gained performace over the base model (unsloth/Llama-3.2-3B-Instruct-bnb-4bit)
- in the following tasks.
-
- | Test | Base Model | Fine-Tuned Model | Performance Gain |
- |---|---|---|---|
- | leaderboard_bbh_logical_deduction_seven_objects | 0.2520 | 0.4360 | 0.1840 |
- | leaderboard_bbh_logical_deduction_five_objects | 0.3560 | 0.4560 | 0.1000 |
- | leaderboard_musr_team_allocation | 0.2200 | 0.3200 | 0.1000 |
- | leaderboard_bbh_disambiguation_qa | 0.3040 | 0.3760 | 0.0720 |
- | leaderboard_gpqa_diamond | 0.2222 | 0.2727 | 0.0505 |
- | leaderboard_bbh_movie_recommendation | 0.5960 | 0.6360 | 0.0400 |
- | leaderboard_bbh_formal_fallacies | 0.5080 | 0.5400 | 0.0320 |
- | leaderboard_bbh_tracking_shuffled_objects_three_objects | 0.3160 | 0.3440 | 0.0280 |
- | leaderboard_bbh_causal_judgement | 0.5455 | 0.5668 | 0.0214 |
- | leaderboard_bbh_web_of_lies | 0.4960 | 0.5160 | 0.0200 |
- | leaderboard_math_geometry_hard | 0.0455 | 0.0606 | 0.0152 |
- | leaderboard_math_num_theory_hard | 0.0519 | 0.0649 | 0.0130 |
- | leaderboard_musr_murder_mysteries | 0.5280 | 0.5400 | 0.0120 |
- | leaderboard_gpqa_extended | 0.2711 | 0.2802 | 0.0092 |
- | leaderboard_bbh_sports_understanding | 0.5960 | 0.6040 | 0.0080 |
- | leaderboard_math_intermediate_algebra_hard | 0.0107 | 0.0143 | 0.0036 |
-
-
- ### Framework versions
-
- - unsloth 2024.11.5
- - trl 0.12.0
-
- ### Training HW
- - V100
-
- I'm doing this to 'Make knowledge free for everyone', using my personal time and resources.

- If you want to support my efforts please visit my ko-fi page: https://ko-fi.com/devquasar

- Also feel free to visit my website https://devquasar.com/
  ---
  base_model:
+ - analytical_reasoning_r16a32_unsloth-Llama-3/2-3B-Instruct-bnb-4bit
+ pipeline_tag: text-generation
  ---

+ [<img src="https://raw.githubusercontent.com/csabakecskemeti/devquasar/main/dq_logo_black-transparent.png" width="200"/>](https://devquasar.com)

+ 'Make knowledge free for everyone'

+ Quantized version of: [analytical_reasoning_r16a32_unsloth-Llama-3/2-3B-Instruct-bnb-4bit](https://huggingface.co/analytical_reasoning_r16a32_unsloth-Llama-3/2-3B-Instruct-bnb-4bit)
+ <a href='https://ko-fi.com/L4L416YX7C' target='_blank'><img height='36' style='border:0px;height:36px;' src='https://storage.ko-fi.com/cdn/kofi6.png?v=6' border='0' alt='Buy Me a Coffee at ko-fi.com' /></a>