leaderboard-pr-bot committed
Commit b5964c3
1 Parent(s): 7be5369

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +118 -2
README.md CHANGED
@@ -1,12 +1,128 @@
  ---
  license: apache-2.0
  datasets:
  - netcat420/MFANN
- library_name: adapter-transformers
  ---

  Fine-tuned on an expansive dataset of over 2.5 million tokens structured for chain-of-thought reasoning, MFANN is built to understand and generate coherent, contextually rich text. Its architecture lets it navigate complex linguistic nuances, chaining ideas together into fluid, human-like discourse.

  MFANN's capacity for reasoning shows in its ability to grasp intricate concepts and synthesize information cohesively. Whether tackling philosophical debates, crafting narratives, or generating analyses, it consistently delivers results that are both insightful and compelling.

- Empowered by its parameter count and fine-tuning process, MFANN excels in a range of applications, from natural language understanding and generation to content creation and dialogue systems. Its versatility makes it a valuable tool across domains, helping researchers, developers, and innovators push the boundaries of what is possible in artificial intelligence.
  ---
  license: apache-2.0
+ library_name: adapter-transformers
  datasets:
  - netcat420/MFANN
+ model-index:
+ - name: MFANN3b
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 43.09
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 72.33
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 26.74
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 40.22
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 62.67
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 3.34
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=netcat420/MFANN3b
+       name: Open LLM Leaderboard
  ---

  Fine-tuned on an expansive dataset of over 2.5 million tokens structured for chain-of-thought reasoning, MFANN is built to understand and generate coherent, contextually rich text. Its architecture lets it navigate complex linguistic nuances, chaining ideas together into fluid, human-like discourse.

  MFANN's capacity for reasoning shows in its ability to grasp intricate concepts and synthesize information cohesively. Whether tackling philosophical debates, crafting narratives, or generating analyses, it consistently delivers results that are both insightful and compelling.

+ Empowered by its parameter count and fine-tuning process, MFANN excels in a range of applications, from natural language understanding and generation to content creation and dialogue systems. Its versatility makes it a valuable tool across domains, helping researchers, developers, and innovators push the boundaries of what is possible in artificial intelligence.
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_netcat420__MFANN3b)
+
+ | Metric                            | Value |
+ |-----------------------------------|------:|
+ | Avg.                              | 41.40 |
+ | AI2 Reasoning Challenge (25-Shot) | 43.09 |
+ | HellaSwag (10-Shot)               | 72.33 |
+ | MMLU (5-Shot)                     | 26.74 |
+ | TruthfulQA (0-shot)               | 40.22 |
+ | Winogrande (5-shot)               | 62.67 |
+ | GSM8k (5-shot)                    |  3.34 |
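For reference, the leaderboard's Avg. row is, as far as the diff shows, the unweighted arithmetic mean of the six benchmark scores being added. A minimal sketch to verify that 41.40 is consistent with the individual values (names and numbers taken from the table above):

```python
# Recompute the leaderboard average from the six per-benchmark scores.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 43.09,
    "HellaSwag (10-Shot)": 72.33,
    "MMLU (5-Shot)": 26.74,
    "TruthfulQA (0-shot)": 40.22,
    "Winogrande (5-shot)": 62.67,
    "GSM8k (5-shot)": 3.34,
}

# Unweighted mean across all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")  # → Avg. = 41.40
```

This matches the 41.40 reported in the table, confirming the summary row is a plain mean rather than a weighted aggregate.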