sangmine committed · verified
Commit f48cbb6 · 1 Parent(s): b709ab8

Update README.md

Files changed (1): README.md +52 -63

README.md CHANGED
@@ -11,18 +11,17 @@ tags:
  - llama-3
  - pytorch
  ---
- # Llama-3-Luxia-Ko-8B
- **Built with Meta Llama 3**<br>
- 17,536 Korean vocab entries were added to the 128,256-entry vocab of Meta's Llama-3 model, for a total of 145,792 vocab entries.<br>
- The model was then further pretrained on roughly 95GB of Korean corpora from various domains, yielding a Korean-specialized pretrained language model.

- ## Model Details
- - **Overview:** This model adds 17,536 Korean vocab entries to Llama-3 and is then pretrained on a Korean corpus, yielding a Korean-specialized language model.

  - **Meta Llama-3:** Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.

  ### Model Description
- - **Developed by:** Saltlux AILabs Language Model Team
- - **Variations:** Llama-3-Luxia-Ko comes as a pretrained model at the 8B parameter scale
  - **Input:** Text input only.
  - **Output:** Generates text and code.
  - **Model Architecture:** Like Meta's Llama-3, Llama-3-Luxia-Ko is an auto-regressive language model that uses an optimized transformer architecture.
@@ -30,9 +29,9 @@ 17,536 Korean vocab entries were added to the 128,256-entry vocab of Meta's Llama-3 model
  - **Status:** This is a static model trained on an offline dataset. Future versions of tuned models will be released as we improve model safety with community feedback.
  - **License:** Llama3 License: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)

- ## Intended Use
- - **Intended Use Cases:** Llama-3-Luxia-Ko is a Korean-specialized, pretrained language model built for commercial and research use.
-
  ### How to Use
  This repository contains `Llama-3-Luxia-Ko-8B` and a codebase for use with transformers.

@@ -45,58 +44,54 @@ model_id = "Saltlux/Llama-3-Luxia-Ko-8B"
  pipeline = transformers.pipeline(
      "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
  )
- pipeline("<|begin_of_text|>안녕하세요~!")

  ```
-
- ## Training Details

  ### Training Data
- - **Overview:** Llama-3-Luxia-Ko was pretrained on a corpus of roughly 95GB, combining publicly available corpora with self-collected news data current as of 2023.<br>
- The pretraining data spans a variety of domains, including law, patents, medicine, and history.
-
- #### Preprocessing
- Preprocessing is performed on roughly 1TB of public Korean corpora and self-collected data, using a normalization tool built in-house at Saltlux.
-
- [Document Delete]
- - Filter short texts (under 120 syllables)
- - Filter long texts (100,000 syllables or more)
- - Filter documents whose Korean ratio is under 25%
- - Filter documents in which bullet characters make up 90% or more
- - Filter documents containing profanity
-
- [Document Modify]
- - Normalize emoticon characters (at most 2 in a row)
- - Normalize newline characters (at most 2 in a row)
- - Remove HTML tags
- - Remove unnecessary characters
- - De-identify personal information (phone numbers, account numbers, etc.)
- - Remove duplicate strings
-
- #### Random Sampling
- Random sampling is performed to prioritize corpora that must be included in training while still sampling across diverse domains.<br>
- Saltlux's sampling method is as follows.
- - Random sampling is applied only to corpora of 10GB or more
- - Nouns and compound nouns are extracted from the input corpus and their extraction frequencies across documents are counted; once a noun or compound noun passes a set frequency threshold, documents containing it are no longer sampled
- - With the noun-frequency threshold set to 1,000, random sampling selects documents from diverse domains as training data
-
- ### Hardware and Hyperparameters
- - **Overview:** The hardware and training parameters used to train Saltlux-Ko-Llama-3.
-
- #### Use Device
- Model pretraining was performed on NVIDIA H100 80GB * 8GA.

  #### Training Hyperparameters
- |Model|Params|Context length|GQA|Learning rate|Batch|Precision|Epoch|
- |---|---|---|---|---|---|---|---|
- |Saltlux-Ko-Llama-3|8B|8k|Yes|5e-6|128|bf16|1.0|

  ### Tokenizer
- - **Overview:** A Korean tokenizer for Llama-3 was trained on 87.85GB of public corpus data, a volume large enough to cover Korean tokens.
-
- #### Tokenizer Train Dataset
- The Korean tokenizer was trained on a variety of public Korean corpora: news, blogs, Korean Wikipedia, dialogue, and specialized domains (law, patents, etc.).

  #### Tokenizer Result
  <table>
@@ -150,17 +145,11 @@ Model pretraining was performed on NVIDIA H100 80GB * 8GA.
  </tr>
  </table>

- ## Model Card Authors
- Saltlux AILabs Language Model Team
-
- ## Model Card Contact
- Saltlux AILabs Language Model Team
-
- ## Citation instructions
  **Llama-3-Luxia-Ko**
  ```
  @article{llama3luxiakomodelcard,
- title={Saltlux Llama 3 Luxia Ko Model Card},
  author={AILabs@Saltlux},
  year={2024},
  url={to be updated}
 
  - llama-3
  - pytorch
  ---

+ # Model Details
+ The <b>Llama-3-Luxia-Ko-8B</b> model, trained and released by Saltlux AI Labs, specializes Meta's Llama-3-8B model <b>for Korean</b>.<br><br>
+ From the more than 1TB of Korean training data held in-house, roughly 100GB was selected for pretraining.<br><br>
+ In addition, the publicly released Llama-3 Tokenizer was extended with Korean and used during pretraining.
+
  - **Meta Llama-3:** Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.
21
 
22
  ### Model Description
+ - **Model developers:** Saltlux AI Labs Language Model Team
+ - **Variation:** Llama-3-Luxia-Ko comes as a pretrained model at the 8B parameter scale
  - **Input:** Text input only.
  - **Output:** Generates text and code.
  - **Model Architecture:** Like Meta's Llama-3, Llama-3-Luxia-Ko is an auto-regressive language model that uses an optimized transformer architecture.
  - **Status:** This is a static model trained on an offline dataset. Future versions of tuned models will be released as we improve model safety with community feedback.
  - **License:** Llama3 License: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)

+ ### Intended Use
+ Llama-3-Luxia-Ko is a Korean-specialized language model built for research; it can be reused and adapted for a variety of natural language generation tasks.
+
  ### How to Use
  This repository contains `Llama-3-Luxia-Ko-8B` and a codebase for use with transformers.

  pipeline = transformers.pipeline(
      "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
  )
+ pipeline("<|begin_of_text|>안녕하세요. 솔트룩스 AI Labs 입니다.")

  ```
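Assembled with the imports that the diff elides, a self-contained version of this quickstart looks like the sketch below; the decoding settings are illustrative assumptions, not values from this card.

```python
# Self-contained sketch of the quickstart above; decoding settings are assumptions.
import torch
import transformers

model_id = "Saltlux/Llama-3-Luxia-Ko-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Plain pretrained LM: prompt with raw text rather than a chat template.
result = pipeline(
    "<|begin_of_text|>안녕하세요. 솔트룩스 AI Labs 입니다.",
    max_new_tokens=64,   # assumed cap on generated length
    do_sample=True,      # assumed sampled decoding
    temperature=0.7,     # assumed temperature
)
print(result[0]["generated_text"])
```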
+ # Training Details
+ The training data and hardware used for Llama-3-Luxia-Ko are Saltlux's own Korean corpus and in-house H100 instances.

  ### Training Data
+ Llama-3-Luxia-Ko was pretrained on a corpus of roughly 100GB, combining publicly available corpora with self-collected news data current as of 2023.<br>
+ Beyond general text, the pretraining data spans a variety of domains, including law, patents, medicine, history, society, culture, and dialogue (written/spoken).
+
+ ### Data Preprocessing
+ To improve the quality of the Korean data we hold, preprocessing is designed and applied at two levels, document deletion (Document Delete) and document modification (Document Modify); a sketch of the delete rules follows the lists below.
+
+ + **Document Delete**
+ - Filter short texts (under 120 syllables)
+ - Filter long texts (100,000 syllables or more)
+ - Filter documents whose Korean ratio is under 25%
+ - Filter documents in which bullet characters make up 90% or more
+ - Filter documents containing profanity
+
+ + **Document Modify**
+ - Normalize emoticon characters (at most 2 in a row)
+ - Normalize newline characters (at most 2 in a row)
+ - Remove HTML tags
+ - Remove unnecessary characters
+ - De-identify personal information (phone numbers, account numbers, etc.)
+ - Remove duplicate strings
+
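For illustration, a minimal sketch of the Document Delete rules above; the thresholds come from this card, while the Hangul-ratio and bullet-line heuristics and the profanity list are assumptions about how such rules are typically implemented.

```python
import re

# Thresholds from the Document Delete rules above; heuristics are assumptions.
MIN_LEN, MAX_LEN = 120, 100_000        # syllable bounds from the card
MIN_KOREAN_RATIO = 0.25                # minimum share of Hangul characters
MAX_BULLET_RATIO = 0.90                # maximum share of bullet-like lines

HANGUL = re.compile(r"[가-힣]")
BULLET_LINE = re.compile(r"^\s*[-*•·]")

def keep_document(text: str, profanity: frozenset = frozenset()) -> bool:
    """Return True if the document survives every delete rule."""
    if not (MIN_LEN <= len(text) < MAX_LEN):
        return False                    # too short or too long
    if len(HANGUL.findall(text)) / len(text) < MIN_KOREAN_RATIO:
        return False                    # not enough Korean
    lines = text.splitlines()
    bullets = sum(bool(BULLET_LINE.match(line)) for line in lines)
    if lines and bullets / len(lines) >= MAX_BULLET_RATIO:
        return False                    # mostly bullet symbols
    if any(word in text for word in profanity):
        return False                    # contains profanity
    return True
```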
+ ### Data Sampling
+ To train Llama-3-Luxia-Ko-8B, 100GB of data, one tenth of the roughly 1TB Korean corpus, was sampled.<br><br>Sampling is designed so that diverse domains and content are represented; the method is as follows (a sketch follows this list).<br>
+ + Sampling targets are domain corpora of 10GB or more
+ + A keyword dictionary is built from the nouns and compound nouns in each domain corpus
+ + Once a keyword's DF (document frequency) reaches a threshold, documents containing that keyword are no longer sampled
+
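A minimal sketch of the DF-based stopping rule, as referenced above; the noun extractor is a stand-in, and the threshold value is taken from the earlier revision of this card (1,000).

```python
from collections import Counter

DF_THRESHOLD = 1_000   # illustrative; the earlier card revision cites 1,000

def sample_domain(documents, extract_nouns):
    """Sample documents until their keywords become over-represented."""
    df = Counter()       # keyword -> count of sampled documents containing it
    sampled = []
    for doc in documents:
        keywords = set(extract_nouns(doc))   # nouns / compound nouns
        if any(df[k] >= DF_THRESHOLD for k in keywords):
            continue     # keyword already saturated: skip this document
        sampled.append(doc)
        df.update(keywords)
    return sampled
```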
+ ### Use Device
+ Model pretraining was performed on NVIDIA H100 80GB * 8EA.

  #### Training Hyperparameters
+ |Model|Params|Context length|GQA|Learning rate|Batch|Precision|
+ |---|---|---|---|---|---|---|
+ |Llama-3-Luxia-Ko|8B|8k|Yes|1e-5|128|bf16|

  ### Tokenizer
+ To specialize the Llama-3 Tokenizer for Korean, 17,536 Korean tokens were added and used (see the sketch after the table).
+ |Model|Vocab Size|
+ |---|---|
+ |Llama-3|128,256|
+ |Llama-3-Luxia-Ko|145,792|
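A minimal sketch of extending a tokenizer this way with `transformers`; the sample tokens stand in for the 17,536 learned Korean tokens, and this is not necessarily the team's exact procedure.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(base_id)

korean_tokens = ["안녕하세요", "솔트룩스"]       # placeholders for the 17,536 tokens
tokenizer.add_tokens(korean_tokens)

model = AutoModelForCausalLM.from_pretrained(base_id)
model.resize_token_embeddings(len(tokenizer))   # grow embeddings to the new vocab
```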

  #### Tokenizer Result
  <table>
 
  </tr>
  </table>

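A comparison like the one in the table above can be reproduced along these lines; the sample sentence is illustrative, and the extended tokenizer should generally need fewer tokens on Korean text.

```python
from transformers import AutoTokenizer

base = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
ko = AutoTokenizer.from_pretrained("Saltlux/Llama-3-Luxia-Ko-8B")

sentence = "솔트룩스에서 한국어 특화 언어모델을 공개했습니다."
print("Llama-3:", len(base.tokenize(sentence)), "tokens")
print("Llama-3-Luxia-Ko:", len(ko.tokenize(sentence)), "tokens")
```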
+ ### Citation instructions
  **Llama-3-Luxia-Ko**
  ```
  @article{llama3luxiakomodelcard,
+ title={Llama 3 Luxia Ko Model Card},
  author={AILabs@Saltlux},
  year={2024},
  url={to be updated}