Update benchmark scores with Llama-2-Ko-7b 40B (20k)
README.md CHANGED

@@ -75,7 +75,7 @@ TBD
 
 ### COPA (F1)
 
-<img src=https://user-images.githubusercontent.com/11323660/
+<img src=https://user-images.githubusercontent.com/11323660/255575809-c037bc6e-0566-436a-a6c1-2329ac92187a.png style="max-width: 700px; width: 100%" />
 
 | Model | 0-shot | 5-shot | 10-shot | 50-shot |
 | --- | --- | --- | --- | --- |
@@ -88,12 +88,13 @@ TBD
 | https://huggingface.co/EleutherAI/polyglot-ko-12.8b | 0.7937 | 0.8108 | 0.8037 | 0.8369 |
 | Llama-2 Original 7B* | 0.562033 | 0.575982 | 0.576216 | 0.595532 |
 | Llama-2-Ko-7b 20B (10k) | 0.738780 | 0.762639 | 0.780761 | 0.797863 |
+| Llama-2-Ko-7b 40B (20k) | 0.743630 | 0.792716 | 0.803746 | 0.825944 |
 
 *Llama-2 Original 7B used https://huggingface.co/meta-llama/Llama-2-7b-hf (No tokenizer updated)
 
 ### HellaSwag (F1)
 
-<img src=https://user-images.githubusercontent.com/11323660/
+<img src=https://user-images.githubusercontent.com/11323660/255576090-a2bfc1ae-d117-44b7-9f7b-262e41179ec1.png style="max-width: 700px; width: 100%" />
 
 | Model | 0-shot | 5-shot | 10-shot | 50-shot |
 | --- | --- | --- | --- | --- |
@@ -106,12 +107,13 @@ TBD
 | https://huggingface.co/EleutherAI/polyglot-ko-12.8b | 0.5954 | 0.6306 | 0.6098 | 0.6118 |
 | Llama-2 Original 7B* | 0.415390 | 0.431382 | 0.421342 | 0.442003 |
 | Llama-2-Ko-7b 20B (10k) | 0.451757 | 0.466751 | 0.472607 | 0.482776 |
+| Llama-2-Ko-7b 40B (20k) | 0.456246 | 0.465665 | 0.469810 | 0.477374 |
 
 *Llama-2 Original 7B used https://huggingface.co/meta-llama/Llama-2-7b-hf (No tokenizer updated)
 
 ### BoolQ (F1)
 
-<img src=https://user-images.githubusercontent.com/11323660/
+<img src=https://user-images.githubusercontent.com/11323660/255576343-5d847a6f-3b6a-41a7-af37-0f11940a5ea4.png style="max-width: 700px; width: 100%" />
 
 | Model | 0-shot | 5-shot | 10-shot | 50-shot |
 | --- | --- | --- | --- | --- |
@@ -124,12 +126,13 @@ TBD
 | https://huggingface.co/EleutherAI/polyglot-ko-12.8b | 0.4818 | 0.6041 | 0.6289 | 0.6448 |
 | Llama-2 Original 7B* | 0.352050 | 0.563238 | 0.474788 | 0.419222 |
 | Llama-2-Ko-7b 20B (10k) | 0.360656 | 0.679743 | 0.680109 | 0.662152 |
+| Llama-2-Ko-7b 40B (20k) | 0.578640 | 0.697747 | 0.708358 | 0.714423 |
 
 *Llama-2 Original 7B used https://huggingface.co/meta-llama/Llama-2-7b-hf (No tokenizer updated)
 
 ### SentiNeg (F1)
 
-<img src=https://user-images.githubusercontent.com/11323660/
+<img src=https://user-images.githubusercontent.com/11323660/255576572-b005a81d-fa4d-4709-b48a-f0fe4eed17a3.png style="max-width: 700px; width: 100%" />
 
 | Model | 0-shot | 5-shot | 10-shot | 50-shot |
 | --- | --- | --- | --- | --- |
@@ -142,7 +145,7 @@ TBD
 | https://huggingface.co/EleutherAI/polyglot-ko-12.8b | 0.9117 | 0.9015 | 0.9345 | 0.9723 |
 | Llama-2 Original 7B* | 0.347502 | 0.529124 | 0.480641 | 0.788457 |
 | Llama-2-Ko-7b 20B (10k) | 0.485546 | 0.829503 | 0.871141 | 0.851253 |
-
+| Llama-2-Ko-7b 40B (20k) | 0.459447 | 0.761079 | 0.727611 | 0.936988 |
 *Llama-2 Original 7B used https://huggingface.co/meta-llama/Llama-2-7b-hf (No tokenizer updated)
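The rows added in this commit can be compared against the previous checkpoint directly. A minimal sketch, assuming nothing beyond the F1 values in the tables above (copied verbatim; the per-benchmark mean-delta aggregation and all names here are our own, not part of the repository):

```python
# Compare Llama-2-Ko-7b checkpoints: 20B tokens (10k steps) vs 40B tokens (20k steps).
# F1 scores are copied verbatim from the benchmark tables in this commit.

SHOTS = ("0-shot", "5-shot", "10-shot", "50-shot")

scores = {
    "COPA": {
        "20B": (0.738780, 0.762639, 0.780761, 0.797863),
        "40B": (0.743630, 0.792716, 0.803746, 0.825944),
    },
    "HellaSwag": {
        "20B": (0.451757, 0.466751, 0.472607, 0.482776),
        "40B": (0.456246, 0.465665, 0.469810, 0.477374),
    },
    "BoolQ": {
        "20B": (0.360656, 0.679743, 0.680109, 0.662152),
        "40B": (0.578640, 0.697747, 0.708358, 0.714423),
    },
    "SentiNeg": {
        "20B": (0.485546, 0.829503, 0.871141, 0.851253),
        "40B": (0.459447, 0.761079, 0.727611, 0.936988),
    },
}

def mean_delta(bench: str) -> float:
    """Mean F1 change across shot settings, 40B checkpoint minus 20B checkpoint."""
    a = scores[bench]["20B"]
    b = scores[bench]["40B"]
    return sum(y - x for x, y in zip(a, b)) / len(SHOTS)

for bench in scores:
    print(f"{bench}: {mean_delta(bench):+.4f} mean F1 delta (40B vs 20B)")
```

On these numbers, the extra 20B tokens help COPA and BoolQ at every shot count, while HellaSwag stays roughly flat and SentiNeg is mixed (lower at 0/5/10-shot, higher at 50-shot).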