kkmkorea committed on
Commit
dd070bf
•
1 Parent(s): 46263fb

Update README.md

  1. README.md +5 -39
README.md CHANGED
@@ -9,9 +9,9 @@ metrics:
 
  <!-- Provide a quick summary of what the model is/does. -->
 
- KorSciDeBERTa is a model based on the Microsoft DeBERTa architecture, pretrained on a 146GB corpus (코퍼스) of papers, NTIS research projects, patents, news, and Korean wiki text.
 
- The pretrained model can be used for masked language modeling or next sentence prediction, and it can also be fine-tuned for downstream tasks such as sentence classification, token classification, or question answering.
 
  ## Model Details
 
@@ -79,31 +79,19 @@ trainer.push_to_hub()
 
  Over- or under-representation of particular viewpoints, stereotypes, personal information, hateful, abusive, or violent language, discriminatory or prejudiced language, generation of irrelevant or repetitive output, etc.
 
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
 
  ## Training Details
 
  ### Training Data
 
  <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
 
  ### Training Procedure
 
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
- #### Preprocessing [optional]
 
  - Science and technology domain tokenizer (KorSci Tokenizer)
  - The corpus was preprocessed with a tokenizer that merges the existing SentencePiece-BPE with the [Mecab-ko Tokenizer](https://bitbucket.org/eunjeon/mecab-ko/src/master/), to which a user dictionary of about 6 million nouns and compound nouns, built from the corpus used for this pretrained model, was added.
@@ -121,11 +109,6 @@ Use the code below to get started with the model.
  - **vocab_size:** 128,100
  - **Training regime:** fp16 mixed precision <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
 
  ## Evaluation
 
@@ -175,23 +158,6 @@ Python 3.9, Cuda 11.8, PyTorch 1.10
  <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
  Korea Institute of Science and Technology Information (2023): Korean science and technology domain DeBERTa pretrained model (KorSciDeBERTa). Version 1.0. Korea Institute of Science and Technology Information.
 
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
 
  ## Model Card Authors
 
 
 
  <!-- Provide a quick summary of what the model is/does. -->
 
+ KorSciDeBERTa is a model based on the Microsoft DeBERTa architecture, pretrained on a 146GB corpus (말뭉치) of papers, NTIS research projects, patents, news, and Korean wiki text.
 
+ The pretrained model can be used for masked language modeling or next sentence prediction, and it can additionally be fine-tuned for downstream tasks such as sentence classification, token classification, or question answering.
 
  ## Model Details
 
 
 
  Over- or under-representation of particular viewpoints, stereotypes, personal information, hateful, abusive, or violent language, discriminatory or prejudiced language, generation of irrelevant or repetitive output, etc.
 
 
  ## Training Details
 
  ### Training Data
 
  <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+ Papers, NTIS research projects, patents, news, and Korean wiki corpora (말뭉치), 146GB in total
 
  ### Training Procedure
 
+ Trained for 1,600,000 steps over 2.5 months on 24 NVIDIA A100 80G GPUs at KISTI HPC
 
+ #### Preprocessing
 
  - Science and technology domain tokenizer (KorSci Tokenizer)
  - The corpus was preprocessed with a tokenizer that merges the existing SentencePiece-BPE with the [Mecab-ko Tokenizer](https://bitbucket.org/eunjeon/mecab-ko/src/master/), to which a user dictionary of about 6 million nouns and compound nouns, built from the corpus used for this pretrained model, was added.
 
  - **vocab_size:** 128,100
  - **Training regime:** fp16 mixed precision <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 
  ## Evaluation
 
 
  <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
  Korea Institute of Science and Technology Information (2023): Korean science and technology domain DeBERTa pretrained model (KorSciDeBERTa). Version 1.0. Korea Institute of Science and Technology Information.
 
 
  ## Model Card Authors
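
For a rough sense of scale, the training figures added under "Training Procedure" (24 NVIDIA A100 80G GPUs, 2.5 months, 1,600,000 steps) can be turned into throughput estimates. The sketch below is only a back-of-the-envelope calculation. It assumes a 30-day month and uninterrupted training on all GPUs, neither of which the README states, and the function name `training_estimates` is made up for illustration.

```python
# Back-of-the-envelope training throughput from the figures in the README.
# Assumptions (NOT stated in the README): a month is 30 days, and training
# ran continuously on all GPUs for the whole period.

NUM_GPUS = 24            # NVIDIA A100 80G GPUs (from the README)
TRAINING_MONTHS = 2.5    # wall-clock duration (from the README)
TOTAL_STEPS = 1_600_000  # training steps (from the README)
DAYS_PER_MONTH = 30      # assumption


def training_estimates(num_gpus, months, total_steps, days_per_month=DAYS_PER_MONTH):
    """Return (wall-clock seconds, seconds per step, GPU-hours)."""
    wall_seconds = months * days_per_month * 24 * 3600
    seconds_per_step = wall_seconds / total_steps
    gpu_hours = num_gpus * wall_seconds / 3600
    return wall_seconds, seconds_per_step, gpu_hours


wall_seconds, seconds_per_step, gpu_hours = training_estimates(
    NUM_GPUS, TRAINING_MONTHS, TOTAL_STEPS
)
print(f"{seconds_per_step:.2f} s/step, {gpu_hours:,.0f} GPU-hours")
```

Under these assumptions the run averages about 4 seconds per step and about 43,000 GPU-hours. Any downtime or a different month length would change both figures.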