weiqipedia
committed on
Update README.md
README.md
CHANGED
@@ -10,8 +10,7 @@ license: llama3
 # LLaMA3 8B SEA-LIONv2
 
 SEA-LION is a collection of Large Language Models (LLMs) which has been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
-This model
-This is the card for the LLaMA3 8B SEA-LIONv2 base model.
+This is the card for the LLaMA3 8B SEA-LIONv2 base model which has undergone continued pre-training from the [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model.
 
 SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.
 
@@ -20,11 +19,6 @@ SEA-LION stands for <i>Southeast Asian Languages In One Network</i>.
 
 ### Model Description
 
-The LLaMA3 8B SEA-LIONv model is a significant leap forward in the field of Natural Language Processing,
-specifically trained to understand the SEA regional context.
-
-For tokenization, the model employs the default tokenizer used in Meta-Llama-3-8B-Instruct.
-
 The continued pre-training data for LLaMA3 8B SEA-LIONv2 base model encompasses approximately 48B tokens.
 
 - **Developed by:** Products Pillar, AI Singapore
@@ -33,11 +27,13 @@ The continued pre-training data for LLaMA3 8B SEA-LIONv2 base model encompasses
 - **Languages:** English, Indonesian, Thai, Vietnamese, Tamil
 - **License:** [LLaMA3 Community License](https://huggingface.co/meta-llama/Meta-Llama-3-8B/blob/main/LICENSE)
 
+For tokenization, the model employs the default tokenizer used in Meta-Llama-3-8B-Instruct.
+
 ### Benchmark Performance
 We evaluated LLaMA3 8B SEA-LIONv2 base model on general language capabilities.
 
 #### General Language Capabilities
-For the evaluation of general language capabilities, we employed the [BHASA evaluation benchmark](https://arxiv.org/abs/2309.06085v2) across a variety of tasks.
+For the evaluation of general language capabilities in SEA languages, we employed the [BHASA evaluation benchmark](https://arxiv.org/abs/2309.06085v2) across a variety of tasks.
 These tasks include Question Answering (QA), Sentiment Analysis (Sentiment), Toxicity Detection (Toxicity), Translation in both directions (Eng>Lang & Lang>Eng), Abstractive Summarization (Summ), Causal Reasoning (Causal) and Natural Language Inference (NLI).
 
 The evaluation was done **five-shot** with native prompts and only a sample of 100-1000 instances for each dataset was used as per the setting described in the paper.
@@ -46,6 +42,8 @@ The evaluation was done **five-shot** with native prompts and only a sample of 1
 
 To be released soon
 
+We also evaluated the model on English capabilities using tasks from the Open LLM Leaderboard.
+
 **English**
 
 | Model | ARC | BBH | HellaSwag | MMLU | GSM8k | Average |
@@ -85,7 +83,7 @@ LLaMA3 8B SEA-LIONv2 base model was continued pre-trained on 48B tokens of the f
 Note:
 - All token counts are counted using LLaMA3 tokenizer
 - wiki* sources includes Wikipedia, Wiki Books, Wiki Source and Wiki Voyage
--
+- Tamil news is sourced with permission from [Seithi](https://seithi.mediacorp.sg/)
 
 ### Infrastructure
 
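
Note on the evaluation setting mentioned in the README: the card says the evaluation was done five-shot on a sample of 100-1000 instances per dataset. The sketch below illustrates what such a setup might look like. It is not the BHASA harness. The function names, the `Question:`/`Answer:` template, and the fixed seed are assumptions made for illustration only; the actual benchmark uses native-language prompts defined in the paper.

```python
import random


def sample_instances(dataset, max_instances=1000, seed=0):
    """Return at most max_instances items, sampled without replacement.

    Hypothetical helper: the README only states that 100-1000 instances
    per dataset were used, not how they were drawn.
    """
    items = list(dataset)
    if len(items) <= max_instances:
        return items
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return rng.sample(items, max_instances)


def build_five_shot_prompt(exemplars, query, k=5):
    """Concatenate k solved exemplars followed by the unsolved query.

    The template is a placeholder, not the prompt used by the benchmark.
    """
    if len(exemplars) < k:
        raise ValueError(f"need at least {k} exemplars, got {len(exemplars)}")
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in exemplars[:k]]
    blocks.append(f"Question: {query}\nAnswer:")  # left open for the model
    return "\n\n".join(blocks)
```

Each sampled test instance would be passed as `query`, with the same five exemplars reused across the dataset, so the base model completes the final `Answer:` line.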