yentinglin committed
Commit e7ea1a0
1 Parent(s): ac4e06b

Update README.md

Files changed (1):
  1. README.md +11 -11
README.md CHANGED
@@ -29,16 +29,16 @@ pipeline_tag: text-generation
 
 
 ## Overview
- Taiwan-LLaMa is a full parameter fine-tuned model based on LLaMa 2 for traditional chinese applications.
+ Taiwan-LLaMa is a full parameter fine-tuned model based on LLaMa 2 for Traditional Mandarin applications.
 
- **Taiwan-LLaMa v1.0** pretrained on over 5 billion tokens and instruction-tuned on over 490k conversations both in traditional chinese.
+ **Taiwan-LLaMa v1.0** is pretrained on over 5 billion tokens and instruction-tuned on over 490k conversations, both in Traditional Mandarin.
 
 ## Demo
 A live demonstration of the model can be accessed at [Hugging Face Spaces](https://huggingface.co/spaces/yentinglin/Taiwan-LLaMa2).
 
 ## Key Features
 
- 1. **Traditional Chinese Support**: The model is fine-tuned to understand and generate text in Traditional Chinese, making it suitable for Taiwanese culture and related applications.
+ 1. **Traditional Mandarin Support**: The model is fine-tuned to understand and generate text in Traditional Mandarin, making it suitable for Taiwanese culture and related applications.
 
 2. **Instruction-Tuned**: Further fine-tuned on conversational data to offer context-aware and instruction-following responses.
 
@@ -48,8 +48,8 @@ A live demonstration of the model can be accessed at [Hugging Face Spaces](https
 
 
 ## Work in progress
- - [ ] **Improved Pretraining**: A refined version of the existing pretraining approach is under development, aiming to enhance model performance.
- - [ ] **Extended Model Length**: Utilizing the Rope mechanism, the model's length will be extended from 4k to 8k.
+ - [ ] **Improved pretraining**: A refined pretraining process (e.g. more Taiwan-sourced data, improved training strategies) is under development, aiming to enhance model performance and better capture Taiwanese culture.
+ - [ ] **Extend max length**: Utilizing the RoPE mechanism described in [the paper](https://arxiv.org/abs/2104.09864), the model's context length will be extended from 4k to 8k.
 
 
 ## Taiwanese Culture Examples
@@ -71,7 +71,7 @@ We provide a number of model checkpoints that we trained. Please find them on Hu
 |--------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
 | **Taiwan-LLaMa v1.0** (_better for Taiwanese Culture_) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v1.0" target="_blank">yentinglin/Taiwan-LLaMa-v1.0</a> |
 | Taiwan-LLaMa v0.9 (partial instruction set) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v0.9" target="_blank">yentinglin/Taiwan-LLaMa-v0.9</a> |
- | Taiwan-LLaMa v0.0 (no Traditional Chinese pretraining) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v0.0" target="_blank">yentinglin/Taiwan-LLaMa-v0.0</a> |
+ | Taiwan-LLaMa v0.0 (no Traditional Mandarin pretraining) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v0.0" target="_blank">yentinglin/Taiwan-LLaMa-v0.0</a> |
 
 ## Data
 
@@ -79,8 +79,8 @@ Here are some quick links to the datasets that we used to train the models:
 
 | **Dataset** | **Link** |
 |---------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
- | **Instruction-tuning** | 🤗 <a href="https://huggingface.co/datasets/yentinglin/traditional_chinese_instructions" target="_blank">yentinglin/traditional_chinese_instructions</a> |
- | Traditional Chinese Pretraining | 🤗 <a href="https://huggingface.co/datasets/yentinglin/zh_TW_c4" target="_blank">yentinglin/zh_TW_c4</a> |
+ | **Instruction-tuning** | 🤗 <a href="https://huggingface.co/datasets/yentinglin/traditional_mandarin_instructions" target="_blank">yentinglin/traditional_mandarin_instructions</a> |
+ | Traditional Mandarin Pretraining | 🤗 <a href="https://huggingface.co/datasets/yentinglin/zh_TW_c4" target="_blank">yentinglin/zh_TW_c4</a> |
 
 
 ## Architecture
@@ -88,12 +88,12 @@ Taiwan-LLaMa is based on LLaMa 2, leveraging transformer architecture, <a href="
 
 It includes:
 
- * Pretraining Phase: Pretrained on a vast corpus of over 5 billion tokens, extracted from common crawl in Traditional Chinese.
+ * Pretraining Phase: Pretrained on a vast corpus of over 5 billion tokens, extracted from Common Crawl in Traditional Mandarin.
 * Fine-tuning Phase: Further instruction-tuned on over 490k multi-turn conversational data to enable more instruction-following and context-aware responses.
 
 ## Generic Capabilities on Vicuna Benchmark
 
- The data is translated into traditional Chinese for evaluating the general capability.
+ The data is translated into Traditional Mandarin to evaluate general capability.
 
 
 <img src="./images/zhtw_vicuna_bench_chatgptbaseline.png" width="700">
@@ -156,7 +156,7 @@ If you use our code, data, or models in your research, please cite this reposito
 ```
 
 ## Collaborate With Us
- If you are interested in contributing to the development of Traditional Chinese language models, exploring new applications, or leveraging Taiwan-LLaMa for your specific needs, please don't hesitate to contact us. We welcome collaborations from academia, industry, and individual contributors.
+ If you are interested in contributing to the development of Traditional Mandarin language models, exploring new applications, or leveraging Taiwan-LLaMa for your specific needs, please don't hesitate to contact us. We welcome collaborations from academia, industry, and individual contributors.
 
 ## License
 The code in this project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
 
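On the work-in-progress item above about extending the maximum length from 4k to 8k: the sketch below is only an illustration of how linear RoPE scaling (position interpolation) can be requested at load time with recent versions of Hugging Face `transformers`. It is not the project's announced procedure, and quality beyond the original 4k context is not guaranteed without the planned additional training.

```python
# Hypothetical illustration only: the released Taiwan-LLaMa checkpoints use a 4k
# context. transformers >= 4.31 accepts a rope_scaling config for LLaMA models;
# a linear factor of 2.0 stretches 4k positions to 8k, but output quality past 4k
# is not guaranteed without the fine-tuning mentioned in the work-in-progress list.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "yentinglin/Taiwan-LLaMa-v1.0",
    rope_scaling={"type": "linear", "factor": 2.0},  # 4k -> 8k positions
    device_map="auto",
)
```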
53
 
54
 
55
  ## Taiwanese Culture Examples
 
71
  |--------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------|
72
  | **Taiwan-LLaMa v1.0** (_better for Taiwanese Culture_) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v1.0" target="_blank">yentinglin/Taiwan-LLaMa-v1.0</a> |
73
  | Taiwan-LLaMa v0.9 (partial instruction set) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v0.9" target="_blank">yentinglin/Taiwan-LLaMa-v0.9</a> |
74
+ | Taiwan-LLaMa v0.0 (no Traditional Mandarin pretraining) | 🤗 <a href="https://huggingface.co/yentinglin/Taiwan-LLaMa-v0.0" target="_blank">yentinglin/Taiwan-LLaMa-v0.0</a> |
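For reference, a minimal sketch of loading the Taiwan-LLaMa v1.0 checkpoint from the checkpoint table in the diff above with the standard `transformers` API. The prompt is an arbitrary example; the instruction-tuned model's exact conversation template is documented in the full model card, not in this diff.

```python
# Minimal sketch: load the v1.0 checkpoint listed in the checkpoint table and
# generate a short completion. Consult the full model card for the exact
# conversation template the instruction-tuned model expects.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yentinglin/Taiwan-LLaMa-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory
    device_map="auto",
)

prompt = "請用繁體中文簡單介紹台灣的夜市文化。"  # "Briefly introduce Taiwan's night market culture in Traditional Mandarin."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```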
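Similarly, a hedged sketch of pulling the two datasets from the dataset table above with the `datasets` library; the split names are assumptions, so check each dataset card for the actual configuration and column layout.

```python
# Sketch: load the pretraining corpus and the instruction-tuning set linked in the
# dataset table above. Split names are assumptions; verify them on the dataset cards.
from datasets import load_dataset

# Traditional Mandarin pretraining corpus (Common Crawl derived); streaming avoids
# downloading the full corpus up front.
pretrain = load_dataset("yentinglin/zh_TW_c4", split="train", streaming=True)
print(next(iter(pretrain)))

# Instruction-tuning conversations.
instructions = load_dataset("yentinglin/traditional_mandarin_instructions", split="train")
print(instructions[0])
```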