Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ Alpha-Instruct has achieved outstanding performance on the **LogicKor, scoring a
 
 ---
 
 ## Overview
 
-Alpha-Instruct is our latest language model, developed using 'Evolutionary Model Merging' technique. This method employs a 1:1 ratio of task-specific datasets from KoBEST and Haerae, resulting in a model
+Alpha-Instruct is our latest language model, developed using the 'Evolutionary Model Merging' technique. This method employs a 1:1 ratio of task-specific datasets from KoBEST and Haerae, resulting in a model named 'Alpha-Ko-8B-Evo'. The following models were used for merging:
 - [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) (Base)
 - [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) (Instruct)
 - [Llama-3-Open-Ko-8B](https://huggingface.co/beomi/Llama-3-Open-Ko-8B) (Continual Pretrained)
@@ -36,7 +36,7 @@ Results in [LogicKor](https://github.com/StableFluffy/LogicKor)* are as follows:
 |:------------------------------:|:------------:|:-----------:|:--------:|
 | MLP-KTLim/llama-3-Korean-Bllossom-8B | 4.238 | 3.404 | 3.821 |
 | Alpha-Ko-Evo | 5.143 | 5.238 | 5.190 |
-| Alpha-Ko-Instruct (alt) | 7.095 | 6.571 | **6.833** |
+| Alpha-Ko-Instruct (alt) | 7.095 | **6.571** | **6.833** |
 | Alpha-Ko-Instruct | **7.143** | 6.065 | 6.620 |
 | Alpha-Ko-Instruct-marlin (4bit) | 6.857 | 5.738 | 6.298 |
 
@@ -44,7 +44,7 @@ Results in [LogicKor](https://github.com/StableFluffy/LogicKor)* are as follows:
 
 Results in KoBEST (acc, num_shot=5) are as follows:
 
-| Task | beomi/Llama-3-Open-Ko-8B-Instruct | maywell/Llama-3-Ko-8B-Instruct | Alpha-Ko-Evo | Alpha-Ko-Instruct |
+| Task | beomi/Llama-3-Open-Ko-8B-Instruct | maywell/Llama-3-Ko-8B-Instruct | **Alpha-Ko-Evo** | **Alpha-Ko-Instruct** |
 | --- | --- | --- | --- | --- |
 | kobest overall | 0.6220 | 0.6852 | 0.7229 | 0.7055 |
 | kobest_boolq | 0.6254 | 0.7208 | 0.8547 | 0.8369 |
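For context on the merge described above: at its core, a merge candidate interpolates the parameters of the source models, and the "evolutionary" part of the technique searches over the interpolation weights. The sketch below is a minimal, hypothetical illustration of that weighted-average step only — it is not the repository's actual merge code (real merges operate on full Llama-3 checkpoints, typically via a tool such as mergekit), and the toy dicts and names (`merge_state_dicts`, `layer.w`) are invented for illustration.

```python
# Minimal sketch: linearly merge several models' parameters with
# per-model weights -- the core operation an evolutionary merge
# search tunes. Toy dicts of float lists stand in for state dicts.

def merge_state_dicts(state_dicts, weights):
    """Weighted average of aligned parameter vectors (here: flat lists)."""
    assert len(state_dicts) == len(weights)
    total = sum(weights)
    norm = [w / total for w in weights]  # normalize so weights sum to 1
    merged = {}
    for name in state_dicts[0]:
        params = [sd[name] for sd in state_dicts]
        merged[name] = [
            sum(w * p[i] for w, p in zip(norm, params))
            for i in range(len(params[0]))
        ]
    return merged

# Three toy "models" with one 2-element parameter vector each,
# standing in for Base / Instruct / Continual-Pretrained checkpoints.
base = {"layer.w": [1.0, 0.0]}
instruct = {"layer.w": [0.0, 1.0]}
ko = {"layer.w": [1.0, 1.0]}

merged = merge_state_dicts([base, instruct, ko], weights=[1.0, 1.0, 2.0])
print(merged["layer.w"])  # [0.75, 0.75]
```

An evolutionary search would repeatedly propose weight vectors, build the merged candidate, score it on the target benchmarks (here, KoBEST and Haerae splits), and keep the best-scoring weights.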