sangmine committed
Commit 0f05ab2 · verified · 1 Parent(s): 587cd4f

Update README.md

Files changed (1)
  1. README.md +9 -45
README.md CHANGED
@@ -13,33 +13,24 @@ tags:
 ---
 
 # Model Details
- The <b>Llama-3-Luxia-Ko-8B</b> model, trained and released by Saltlux AI Labs, is a <b>Korean-specialized</b> version of Meta's Llama-3-8B.<br><br>
- About 100GB of data was selected from our in-house collection of more than 1TB of Korean training data and used for pretraining.<br><br>
 We also extended the publicly released Llama-3 tokenizer for Korean and used it during pretraining.
 
 - **Meta Llama-3:** Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.
-
- ### Model Description
- - **Model developers:** Saltlux AI Labs, Language Model Team
- - **Variation:** Llama-3-Luxia-Ko, a pretrained model at the 8B-parameter scale
- - **Input:** Text only.
- - **Output:** Generates text and code.
- - **Model Architecture:** Llama-3-Luxia-Ko is an auto-regressive language model that uses an optimized transformer architecture, like Meta's Llama-3.
- - **Model Release Date:** April 30, 2024.
- - **Status:** This is a static model trained on an offline dataset. Future versions of the tuned model will be released as we improve model safety with community feedback.
 - **License:** Llama3 License: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
 
 ### Intended Use
- Llama-3-Luxia-Ko is a Korean-specialized language model built for research purposes, and it can be reused and adapted for a variety of natural language generation tasks.
 
 ### How to Use
- This repository contains `Llama-3-Luxia-Ko-8B` along with a codebase for use with transformers.
 
 ```
 import transformers
 import torch
 
- model_id = "Saltlux/Llama-3-Luxia-Ko-8B"
 
 pipeline = transformers.pipeline(
 "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
@@ -48,38 +39,11 @@ pipeline("<|begin_of_text|>안녕하세요. 솔트룩스 AI Labs 입니다.")
 
 ```
 # Training Details
- The training data and hardware used to train Llama-3-Luxia-Ko were Saltlux's in-house Korean corpus and H100 instances.
-
- ### Training Data
- Llama-3-Luxia-Ko was pretrained on a corpus of roughly 100GB, combining publicly available corpora with in-house news data collected up to 2023.<br>
- Beyond the general domain, the pretraining data covers domains such as law, patents, medicine, history, society, culture, and dialogue (written/spoken).
-
57
- ### Data Preprocessing
58
- ๋ณด์œ ํ•˜๊ณ  ์žˆ๋Š” ํ•œ๊ตญ์–ด ๋ฐ์ดํ„ฐ์˜ ํ’ˆ์งˆ ํ–ฅ์ƒ์„ ์œ„ํ•ด ๋ฌธ์„œ ์‚ญ์ œ(Document Delete), ๋ฌธ์„œ ์ˆ˜์ •(Document Modify) ์ˆ˜์ค€์˜ ์ „์ฒ˜๋ฆฌ ๋ฐฉ์•ˆ์„ ์ˆ˜๋ฆฝํ•˜๊ณ  ์ ์šฉํ•ฉ๋‹ˆ๋‹ค.
59
-
60
- + **Document Delete**
61
- - ์งง์€ ํ…์ŠคํŠธ (120 ์Œ์ ˆ ๋ฏธ๋งŒ) ํ•„ํ„ฐ๋ง
62
- - ๊ธด ํ…์ŠคํŠธ (100,000 ์Œ์ ˆ ์ด์ƒ) ํ•„ํ„ฐ๋ง
63
- - ํ•œ๊ตญ์–ด ๋น„์œจ์ด 25% ๋ฏธ๋งŒ์ธ ๊ฒฝ์šฐ ํ•„ํ„ฐ๋ง
64
- - ๊ธ€๋จธ๋ฆฌ ๊ธฐํ˜ธ๊ฐ€ 90% ์ด์ƒ์ธ ๊ฒฝ์šฐ ํ•„ํ„ฐ๋ง
65
- - ์š•์„ค์ด ์žˆ๋Š” ๊ฒฝ์šฐ ํ•„ํ„ฐ๋ง
66
-
67
- + **Document Modify**
68
- - ์ด๋ชจ์…˜ ๋ฌธ์ž ์ •๊ทœํ™” (์ตœ๋Œ€ 2๊ฐœ๊นŒ์ง€ ํ—ˆ์šฉ)
69
- - ๊ฐœํ–‰ ๋ฌธ์ž ์ •๊ทœํ™” (์ตœ๋Œ€ 2๊ฐœ๊นŒ์ง€ ํ—ˆ์šฉ)
70
- - HTML ํƒœ๊ทธ ์ œ๊ฑฐ
71
- - ๋ถˆํ•„์š”ํ•œ ๋ฌธ์ž ์ œ๊ฑฐ
72
- - ๋น„์‹๋ณ„ํ™” ์ง„ํ–‰ (ํœด๋Œ€ํฐ ๋ฒˆํ˜ธ, ๊ณ„์ขŒ๋ฒˆํ˜ธ ๋“ฑ์˜ ๊ฐœ์ธ์ •๋ณด)
73
- - ์ค‘๋ณต ๋ฌธ์ž์—ด ์ œ๊ฑฐ
74
-
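Taken together, the Document Delete and Document Modify rules can be sketched as two small filters. This is a minimal illustration with hypothetical helper names; the thresholds follow the lists above, and the profanity and duplicate-string steps are omitted for brevity:

```python
import re

def should_delete(doc: str) -> bool:
    # Document Delete rules: length bounds, Korean ratio, bullet-symbol ratio.
    n = len(doc)
    if n < 120 or n >= 100_000:                       # too short / too long
        return True
    hangul = sum('가' <= ch <= '힣' for ch in doc)    # Hangul syllable count
    if hangul / n < 0.25:                             # Korean ratio below 25%
        return True
    bullets = sum(ch in '•◦▪·-*' for ch in doc)
    if bullets / n >= 0.9:                            # mostly bullet symbols
        return True
    return False

def modify(doc: str) -> str:
    # Document Modify rules: strip HTML, cap repeated newlines, de-identify.
    doc = re.sub(r'<[^>]+>', '', doc)                 # remove HTML tags
    doc = re.sub(r'\n{3,}', '\n\n', doc)              # allow at most 2 newlines
    doc = re.sub(r'\b\d{3}-\d{3,4}-\d{4}\b', '[PHONE]', doc)  # mask phone numbers
    return doc
```

In a real pipeline these run in order: documents failing `should_delete` are dropped, and the survivors pass through `modify`.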
- ### Data Sampling
- To train Llama-3-Luxia-8B, we sampled 100GB of data, one tenth of our 1TB Korean corpus.<br><br>Sampling is designed so that diverse domains and content are represented, as follows:<br>
- + Sampling targets are domain corpora at least 10GB in size
- + Build a keyword dictionary from nouns and compound nouns in each domain corpus
- + When the DF (Document Frequency) of a keyword reaches a threshold, stop sampling documents containing that keyword
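The DF-threshold rule above can be sketched as a greedy pass over a domain corpus. This is a minimal illustration; the name `sample_documents` is hypothetical, and in the actual pipeline the keyword dictionary is built from nouns and compound nouns rather than passed in directly:

```python
from collections import Counter

def sample_documents(docs, keywords, df_threshold):
    """Keep documents until every keyword they contain has already appeared
    in df_threshold sampled documents; then skip further such documents."""
    df = Counter()              # document frequency of each keyword so far
    sampled = []
    for doc in docs:
        hits = {k for k in keywords if k in doc}
        if hits and any(df[k] >= df_threshold for k in hits):
            continue            # a keyword is saturated: stop sampling it
        sampled.append(doc)
        for k in hits:
            df[k] += 1
    return sampled
```

Capping document frequency per keyword keeps any one topic from dominating the 100GB sample.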
 
 ### Use Device
 Pretraining was performed on eight NVIDIA H100 80GB GPUs.
 
 #### Training Hyperparameters
 <table>
@@ -100,7 +64,7 @@ Pretraining was performed on eight NVIDIA H100 80GB GPUs.
 </td>
 </tr>
 <tr>
- <td>Llama-3-Luxia-Ko
 </td>
 <td>8B
 </td>
 
 ---
 
 # Model Details
+ The <b>Ko-Llama3-Luxia-8B</b> model, trained and released by the Saltlux AI Labs Language Model Team, is a <b>Korean-specialized</b> version of Meta's Llama-3-8B.<br><br>
+ About 100GB of data was selected from our in-house collection of more than 1TB of Korean training data and used for pretraining.<br><br>
 We also extended the publicly released Llama-3 tokenizer for Korean and used it during pretraining.
 
 - **Meta Llama-3:** Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.
 - **License:** Llama3 License: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
 
 ### Intended Use
+ Ko-Llama3-Luxia-8B was built for research purposes and can be freely trained on and used for a variety of natural language generation tasks.
 
 ### How to Use
+ This model card provides the `Ko-Llama3-Luxia-8B` model together with example code based on the transformers library.
 
 ```
 import transformers
 import torch
 
+ model_id = "saltlux/Ko-Llama3-Luxia-8B"
 
 pipeline = transformers.pipeline(
 "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
 
 ```
 # Training Details
+ The training data and hardware used to train Ko-Llama3-Luxia-8B were Saltlux's in-house Korean corpus and H100 instances.
+ The pretraining data for Korean specialization is a roughly 100GB corpus drawn from our own holdings, covering the latest 2023 news as well as domains such as law, patents, medicine, history, society, culture, and dialogue (written/spoken).<br>
 
 ### Use Device
+ Pretraining was performed on eight NVIDIA H100 80GB GPUs.
 
 #### Training Hyperparameters
 <table>
 </td>
 </tr>
 <tr>
+ <td>Ko-Llama3-Luxia-8B
 </td>
 <td>8B
 </td>