leaderboard-pr-bot committed
Commit b5964c3 • 1 Parent(s): 7be5369
Adding Evaluation Results
This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr
The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.
If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
README.md
CHANGED
@@ -1,12 +1,128 @@
 ---
 license: apache-2.0
+library_name: adapter-transformers
 datasets:
 - netcat420/MFANN
-
+model-index:
+- name: MFANN3b
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 43.09
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 72.33
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 26.74
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 40.22
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 62.67
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 3.34
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+      name: Open LLM Leaderboard
 ---
 
 Fine-tuned on an expansive database comprising over 2.5 million tokens meticulously structured for chain of thought reasoning, MFANN emerges as a powerhouse in understanding and generating coherent, contextually rich text. Its robust architecture enables it to seamlessly navigate complex linguistic nuances, adeptly chaining together ideas to produce fluid, human-like discourse.
 
 MFANN's exceptional capacity for reasoning shines through in its ability to grasp intricate concepts and synthesize information cohesively. Whether tackling intricate philosophical debates, crafting compelling narratives, or generating insightful analyses, MFANN consistently delivers results that are both insightful and compelling.
 
-Empowered by its massive parameter count and rigorous fine-tuning process, MFANN excels in a myriad of applications, from natural language understanding and generation to content creation and dialogue systems. Its versatility and proficiency make it an invaluable tool across various domains, empowering researchers, developers, and innovators alike to push the boundaries of what's possible in artificial intelligence.
+Empowered by its massive parameter count and rigorous fine-tuning process, MFANN excels in a myriad of applications, from natural language understanding and generation to content creation and dialogue systems. Its versatility and proficiency make it an invaluable tool across various domains, empowering researchers, developers, and innovators alike to push the boundaries of what's possible in artificial intelligence.
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_netcat420__MFANN3b)
+
+|             Metric              |Value|
+|---------------------------------|----:|
+|Avg.                             |41.40|
+|AI2 Reasoning Challenge (25-Shot)|43.09|
+|HellaSwag (10-Shot)              |72.33|
+|MMLU (5-Shot)                    |26.74|
+|TruthfulQA (0-shot)              |40.22|
+|Winogrande (5-shot)              |62.67|
+|GSM8k (5-shot)                   | 3.34|
+
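The PR does not say how the `Avg.` row is computed. A minimal sketch, assuming it is the unweighted mean of the six benchmark scores rounded to two decimals, can be used to check the table. The function name `leaderboard_average` is hypothetical and only for illustration; the scores are copied from the table added in this PR.

```python
# Scores copied from the results table added to README.md in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 43.09,
    "HellaSwag (10-Shot)": 72.33,
    "MMLU (5-Shot)": 26.74,
    "TruthfulQA (0-shot)": 40.22,
    "Winogrande (5-shot)": 62.67,
    "GSM8k (5-shot)": 3.34,
}


def leaderboard_average(values):
    """Unweighted mean rounded to two decimals (assumed aggregation)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")  # prints "Avg. 41.40"
```

Under that assumption the result matches the reported `Avg.` value of 41.40, so the table is internally consistent.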