Add benchmark for 20B ckpt
README.md CHANGED
@@ -62,6 +62,80 @@ Llama-2-Ko is an auto-regressive language model that uses an optimized transform
62 |
['▁L', 'l', 'ama', '▁', '2', ':', '▁Open', '▁Foundation', '▁and', '▁Fine', '-', 'T', 'un', 'ed', '▁Ch', 'at', '▁Mod', 'els']
|
63 |
```
|
64 |

## **Model Benchmark**

## LM Eval Harness - Korean (polyglot branch)

### NSMC (Acc) - 50000 full test

TBD
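
The scores in the tables below come from the polyglot branch of EleutherAI's lm-evaluation-harness. A hedged sketch of a typical invocation (task names and flags follow that branch's conventions and may differ by revision; `MODEL_PATH` is a placeholder for the checkpoint under evaluation):

```shell
# Clone the polyglot branch of lm-evaluation-harness, which carries the Korean (KoBEST) tasks.
git clone -b polyglot https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
pip install -e .

# Evaluate a Hugging Face causal LM on the Korean tasks reported below.
python main.py \
  --model gpt2 \
  --model_args pretrained=MODEL_PATH \
  --tasks kobest_copa,kobest_hellaswag,kobest_boolq,kobest_sentineg \
  --num_fewshot 5
```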

### COPA (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) | 0.6696 | 0.6477 | 0.6419 | 0.6514 |
| [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) | 0.7345 | 0.7287 | 0.7277 | 0.7479 |
| [facebook/xglm-7.5B](https://huggingface.co/facebook/xglm-7.5B) | 0.6723 | 0.6731 | 0.6769 | 0.7119 |
| [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.7196 | 0.7193 | 0.7204 | 0.7206 |
| [EleutherAI/polyglot-ko-3.8b](https://huggingface.co/EleutherAI/polyglot-ko-3.8b) | 0.7595 | 0.7608 | 0.7638 | 0.7788 |
| [EleutherAI/polyglot-ko-5.8b](https://huggingface.co/EleutherAI/polyglot-ko-5.8b) | 0.7745 | 0.7676 | 0.7775 | 0.7887 |
| [EleutherAI/polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) | 0.7937 | 0.8108 | 0.8037 | 0.8369 |
| Llama-2 Original 7B* | 0.562033 | 0.575982 | 0.576216 | 0.595532 |
| Llama-2-Ko-7b 20B (10k) | 0.738780 | 0.762639 | 0.780761 | 0.797863 |

*Llama-2 Original 7B used [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) (no tokenizer update)
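
The COPA, HellaSwag, BoolQ, and SentiNeg values are macro-averaged F1 scores. A minimal, self-contained sketch of that metric (for illustration only, not the harness's own implementation):

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then average with equal class weight."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but was wrong
            fn[t] += 1  # true class t, but was missed
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        scores.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    return sum(scores) / len(scores)
```

Unlike plain accuracy, this weights every class equally, so it is less forgiving on imbalanced label distributions.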

### HellaSwag (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) | 0.5243 | 0.5272 | 0.5166 | 0.5352 |
| [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) | 0.5590 | 0.5833 | 0.5828 | 0.5907 |
| [facebook/xglm-7.5B](https://huggingface.co/facebook/xglm-7.5B) | 0.5665 | 0.5689 | 0.5565 | 0.5622 |
| [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.5247 | 0.5260 | 0.5278 | 0.5427 |
| [EleutherAI/polyglot-ko-3.8b](https://huggingface.co/EleutherAI/polyglot-ko-3.8b) | 0.5707 | 0.5830 | 0.5670 | 0.5787 |
| [EleutherAI/polyglot-ko-5.8b](https://huggingface.co/EleutherAI/polyglot-ko-5.8b) | 0.5976 | 0.5998 | 0.5979 | 0.6208 |
| [EleutherAI/polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) | 0.5954 | 0.6306 | 0.6098 | 0.6118 |
| Llama-2 Original 7B* | 0.415390 | 0.431382 | 0.421342 | 0.442003 |
| Llama-2-Ko-7b 20B (10k) | 0.451757 | 0.466751 | 0.472607 | 0.482776 |

*Llama-2 Original 7B used [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) (no tokenizer update)

### BoolQ (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) | 0.3356 | 0.4014 | 0.3640 | 0.3560 |
| [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) | 0.4514 | 0.5981 | 0.5499 | 0.5202 |
| [facebook/xglm-7.5B](https://huggingface.co/facebook/xglm-7.5B) | 0.4464 | 0.3324 | 0.3324 | 0.3324 |
| [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.3552 | 0.4751 | 0.4109 | 0.4038 |
| [EleutherAI/polyglot-ko-3.8b](https://huggingface.co/EleutherAI/polyglot-ko-3.8b) | 0.4320 | 0.5263 | 0.4930 | 0.4038 |
| [EleutherAI/polyglot-ko-5.8b](https://huggingface.co/EleutherAI/polyglot-ko-5.8b) | 0.4356 | 0.5698 | 0.5187 | 0.5236 |
| [EleutherAI/polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) | 0.4818 | 0.6041 | 0.6289 | 0.6448 |
| Llama-2 Original 7B* | 0.352050 | 0.563238 | 0.474788 | 0.419222 |
| Llama-2-Ko-7b 20B (10k) | 0.360656 | 0.679743 | 0.680109 | 0.662152 |

*Llama-2 Original 7B used [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) (no tokenizer update)

### SentiNeg (F1)

| Model | 0-shot | 5-shot | 10-shot | 50-shot |
| --- | --- | --- | --- | --- |
| [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) | 0.6065 | 0.6878 | 0.7280 | 0.8413 |
| [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) | 0.3747 | 0.8942 | 0.9294 | 0.9698 |
| [facebook/xglm-7.5B](https://huggingface.co/facebook/xglm-7.5B) | 0.3578 | 0.4471 | 0.3964 | 0.5271 |
| [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) | 0.6790 | 0.6257 | 0.5514 | 0.7851 |
| [EleutherAI/polyglot-ko-3.8b](https://huggingface.co/EleutherAI/polyglot-ko-3.8b) | 0.4858 | 0.7950 | 0.7320 | 0.7851 |
| [EleutherAI/polyglot-ko-5.8b](https://huggingface.co/EleutherAI/polyglot-ko-5.8b) | 0.3394 | 0.8841 | 0.8808 | 0.9521 |
| [EleutherAI/polyglot-ko-12.8b](https://huggingface.co/EleutherAI/polyglot-ko-12.8b) | 0.9117 | 0.9015 | 0.9345 | 0.9723 |
| Llama-2 Original 7B* | 0.347502 | 0.529124 | 0.480641 | 0.788457 |
| Llama-2-Ko-7b 20B (10k) | 0.485546 | 0.829503 | 0.871141 | 0.851253 |

*Llama-2 Original 7B used [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) (no tokenizer update)
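
To read the tables at a glance, the two Llama-2 rows can be diffed per shot count. A small sketch using the COPA values from the table above:

```python
# Per-shot COPA F1 from the benchmark table (0/5/10/50-shot).
copa = {
    "Llama-2 Original 7B":     [0.562033, 0.575982, 0.576216, 0.595532],
    "Llama-2-Ko-7b 20B (10k)": [0.738780, 0.762639, 0.780761, 0.797863],
}

# Absolute F1 gain of the Korean-adapted checkpoint at each shot count.
gains = [round(ko - base, 6)
         for ko, base in zip(copa["Llama-2-Ko-7b 20B (10k)"],
                             copa["Llama-2 Original 7B"])]
print(gains)  # [0.176747, 0.186657, 0.204545, 0.202331]
```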

---

> Below is the original model card of the Llama-2 model.