Update README.md #6
by SuperkingbasSKB

README.md CHANGED
@@ -18,17 +18,17 @@ tags:
 - medical
 - text-generation-inference
 ---
-# OpenThaiLLM
-**OpenThaiLLM-
-It demonstrates
-constrained generation, and reasoning tasks.is a
 ## Introduction

-Qwen2

 Compared with the state-of-the-art open-source language models, including the previously released Qwen1.5, Qwen2 has generally surpassed most open-source models and demonstrated competitiveness against proprietary models across a series of benchmarks targeting language understanding, language generation, multilingual capability, coding, mathematics, reasoning, etc.

-Qwen2

 For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2/), [GitHub](https://github.com/QwenLM/Qwen2), and [Documentation](https://qwen.readthedocs.io/en/latest/).
 <br>
@@ -83,13 +83,13 @@ print(response)
 ```

 ## Evaluation Performance Few-shot (5 shot)
-| Model | ONET | IC | TGAT | TPAT-1 | A-Level | Average ThaiExam) |
 | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
-| | | | | | | |
 | SeaLLM-v3-7B | 0.4753 | 0.6421 | 0.6153 | 0.3275 | 0.3464 | 0.4813 | 0.7037 | 0.4907 | 0.4625 |
-| llama-3-typhoon-v1.5-8b | 0.3765 | 0.3473 | 0.5538 | 0.4137 | 0.2913 | 0.3965 | 0.4312 | 0.6451 |
-| OpenThaiGPT-1.0.0-7B | 0.3086 | 0.3052 | 0.4153 | 0.3017 | 0.2755 | 0.3213 | 0.255 | 0.3512 |
-| Meta-Llama-3.1-8B | 0.3641 | 0.2631 | 0.2769 | 0.3793 | 0.1811 | 0.2929 | 0.4239 | 0.6591 |

 ## Evaluation Performance Few-shot (2 shot)

@@ -102,4 +102,3 @@ If you find our work helpful, feel free to give us a cite.
 title={Qwen2 Technical Report},
 year={2024}
 }
-```
@@ -18,17 +18,17 @@ tags:
 - medical
 - text-generation-inference
 ---
+# OpenThaiLLM-DoodNiLT: Thai & Chinese Large Language Model (Instruct)
+**OpenThaiLLM-DoodNiLT-Instruct** is a 7-billion-parameter instruct model designed for the Thai 🇹🇭 and Chinese 🇨🇳 languages.
+It demonstrates competitive performance with GPT-3.5-turbo and llama-3-typhoon-v1.5-8b-instruct, and is optimized for application use cases, Retrieval-Augmented Generation (RAG),
+constrained generation, and reasoning tasks. It is a Thai 🇹🇭 and Chinese 🇨🇳 large language model with 7 billion parameters, based on Qwen2-7B.
 ## Introduction

+Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 7B Qwen2 model.

 Compared with the state-of-the-art open-source language models, including the previously released Qwen1.5, Qwen2 has generally surpassed most open-source models and demonstrated competitiveness against proprietary models across a series of benchmarks targeting language understanding, language generation, multilingual capability, coding, mathematics, reasoning, etc.

+Qwen2-7B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs. Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2 for handling long texts.

 For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2/), [GitHub](https://github.com/QwenLM/Qwen2), and [Documentation](https://qwen.readthedocs.io/en/latest/).
 <br>
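The long-context line added above defers to a "Processing Long Texts" section that is not part of this diff. For orientation only, here is a minimal sketch of the usual Qwen2-7B-Instruct recipe for serving inputs beyond 32,768 tokens: add a YaRN `rope_scaling` entry to the checkpoint's `config.json` before launching an engine such as vLLM. The checkpoint path is hypothetical and the scaling values are carried over from the upstream Qwen2 documentation, not from this model card.

```python
# Hedged sketch: enable YaRN rope scaling for long-context serving,
# following the upstream Qwen2-7B-Instruct guidance.
# The checkpoint directory is a hypothetical local path; scaling values are
# assumptions taken from the Qwen2 docs, not from this README.
import json
from pathlib import Path

config_path = Path("OpenThaiLLM-DoodNiLT-Instruct/config.json")  # hypothetical local checkpoint

config = json.loads(config_path.read_text())
config["rope_scaling"] = {
    "factor": 4.0,                               # 32,768 * 4 = 131,072-token window
    "original_max_position_embeddings": 32768,
    "type": "yarn",
}
config_path.write_text(json.dumps(config, indent=2))
```

Because static YaRN applies the same scaling factor to short inputs as well, the upstream Qwen2 docs suggest adding this entry only when long inputs are actually required.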
@@ -83,13 +83,13 @@ print(response)
 ```

 ## Evaluation Performance Few-shot (5 shot)
+| Model | ONET | IC | TGAT | TPAT-1 | A-Level | Average (ThaiExam) | M3Exam (1 shot) | MMLU |
 | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| DoodNiLT-7B | **0.5185** | **0.6421** | **0.6461** | **0.4224** | **0.3937** | **0.5245** | **0.5355** | 0.6644 |
+| llama-3-typhoon-v1.5-8b | 0.3765 | 0.3473 | 0.5538 | 0.4137 | 0.2913 | 0.3965 | 0.6451 | 0.4312 | 0.4125 |
+| OpenThaiGPT-1.0.0-7B | 0.3086 | 0.3052 | 0.4153 | 0.3017 | 0.2755 | 0.3213 | 0.3512 | 0.255 | 0.3105 |
+| Meta-Llama-3.1-8B | 0.3641 | 0.2631 | 0.2769 | 0.3793 | 0.1811 | 0.2929 | 0.6591 | 0.4239 | 0.3583 |
 | SeaLLM-v3-7B | 0.4753 | 0.6421 | 0.6153 | 0.3275 | 0.3464 | 0.4813 | 0.7037 | 0.4907 | 0.4625 |

 ## Evaluation Performance Few-shot (2 shot)

@@ -102,4 +102,3 @@ If you find our work helpful, feel free to give us a cite.
 title={Qwen2 Technical Report},
 year={2024}
 }