Daemontatox committed
Commit 7f63536 · verified · Parent: 03d0ed0

Adding Evaluation Results


This is an automated PR created with [this space](https://huggingface.co/spaces/T145/open-llm-leaderboard-results-to-modelcard)!

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

Please report any issues here: https://huggingface.co/spaces/T145/open-llm-leaderboard-results-to-modelcard/discussions
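
For context, the `model-index:` block added in the diff below is the machine-readable form of these leaderboard results. A minimal sketch of reading it back once this PR is merged, assuming the `huggingface_hub` library's `ModelCard` parser (not something this PR itself touches):

```python
# Illustrative only: load the model card and print the evaluation results
# that the model-index YAML block below encodes.
from huggingface_hub import ModelCard

card = ModelCard.load("Daemontatox/Llama3.3-70B-CogniLink")

# eval_results is populated from the `model-index:` section of the card header.
for result in card.data.eval_results or []:
    print(f"{result.dataset_name}: {result.metric_type} = {result.metric_value}")
```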

Files changed (1)
  1. README.md +114 -1
README.md CHANGED
@@ -17,6 +17,105 @@ datasets:
 - gghfez/QwQ-LongCoT-130K-cleaned
 pipeline_tag: text-generation
 library_name: transformers
+model-index:
+- name: Llama3.3-70B-CogniLink
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: wis-k/instruction-following-eval
+      split: train
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 69.31
+      name: averaged accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: SaylorTwift/bbh
+      split: test
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 52.12
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: lighteval/MATH-Hard
+      split: test
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 39.58
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      split: train
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 26.06
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 21.4
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 46.37
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Daemontatox%2FLlama3.3-70B-CogniLink
+      name: Open LLM Leaderboard
 ---
 ![image](./image.webp)
 
@@ -99,4 +198,18 @@ CogniLink is available for download and deployment. Start integrating advanced r
 
 For inquiries, contributions, or support, visit **[Unsloth GitHub](https://github.com/unslothai/unsloth)**.
 
-**CogniLink: Connecting Intelligence with Clarity.**
+**CogniLink: Connecting Intelligence with Clarity.**
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Daemontatox__Llama3.3-70B-CogniLink-details)!
+Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox%2FLlama3.3-70B-CogniLink&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
+
+|      Metric       |Value (%)|
+|-------------------|--------:|
+|**Average**        |    42.47|
+|IFEval (0-Shot)    |    69.31|
+|BBH (3-Shot)       |    52.12|
+|MATH Lvl 5 (4-Shot)|    39.58|
+|GPQA (0-shot)      |    26.06|
+|MuSR (0-shot)      |    21.40|
+|MMLU-PRO (5-shot)  |    46.37|
+
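
The **Average** row in the appended table is consistent with an unweighted mean of the six benchmark scores above it; a quick sanity check (illustrative only, not part of the PR):

```python
# Recompute the leaderboard average from the six per-benchmark scores
# listed in the table added by this PR.
scores = {
    "IFEval (0-Shot)": 69.31,
    "BBH (3-Shot)": 52.12,
    "MATH Lvl 5 (4-Shot)": 39.58,
    "GPQA (0-shot)": 26.06,
    "MuSR (0-shot)": 21.40,
    "MMLU-PRO (5-shot)": 46.37,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 42.47, matching the table's Average row
```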