kkmkorea committed on
Commit
7dc1eb9
•
1 Parent(s): 3b57260

Update README.md

Files changed (1)
  1. README.md +38 -45
README.md CHANGED
@@ -5,11 +5,11 @@ language:
5
  metrics:
6
  - accuracy
7
  ---
8
- # Model Card for Model ID
9
 
10
  <!-- Provide a quick summary of what the model is/does. -->
11
 
12
- This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
13
 
14
  ## Model Details
15
 
@@ -19,20 +19,16 @@ This modelcard aims to be a base template for new models. It has been generated
19
 
20
 
21
 
22
- - **Developed by:** [More Information Needed]
23
- - **Shared by [optional]:** [More Information Needed]
24
- - **Model type:** [More Information Needed]
25
- - **Language(s) (NLP):** [More Information Needed]
26
- - **License:** [More Information Needed]
27
- - **Finetuned from model [optional]:** [More Information Needed]
28
 
29
- ### Model Sources [optional]
30
 
31
  <!-- Provide the basic links for the model. -->
32
 
33
- - **Repository:** [More Information Needed]
34
- - **Paper [optional]:** [More Information Needed]
35
- - **Demo [optional]:** [More Information Needed]
36
 
37
  ## Uses
38
 
@@ -53,14 +49,17 @@ This modelcard aims to be a base template for new models. It has been generated
53
  ### Out-of-Scope Use
54
 
55
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
56
-
57
- [More Information Needed]
 
 
58
 
59
  ## Bias, Risks, and Limitations
60
 
61
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
62
-
63
- [More Information Needed]
 
64
 
65
  ### Recommendations
66
 
@@ -88,12 +87,21 @@ Use the code below to get started with the model.
88
 
89
  #### Preprocessing [optional]
90
 
91
- [More Information Needed]
92
-
 
 
 
93
 
94
  #### Training Hyperparameters
95
 
96
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 
 
 
 
 
97
 
98
  #### Speeds, Sizes, Times [optional]
99
 
@@ -111,7 +119,8 @@ Use the code below to get started with the model.
111
 
112
  <!-- This should link to a Data Card if possible. -->
113
 
114
- [More Information Needed]
 
115
 
116
  #### Factors
117
 
@@ -133,25 +142,8 @@ Use the code below to get started with the model.
133
 
134
 
135
 
136
- ## Model Examination [optional]
137
-
138
- <!-- Relevant interpretability work for the model goes here -->
139
-
140
- [More Information Needed]
141
 
142
- ## Environmental Impact
143
-
144
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
145
-
146
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
147
-
148
- - **Hardware Type:** [More Information Needed]
149
- - **Hours used:** [More Information Needed]
150
- - **Cloud Provider:** [More Information Needed]
151
- - **Compute Region:** [More Information Needed]
152
- - **Carbon Emitted:** [More Information Needed]
153
-
154
- ## Technical Specifications [optional]
155
 
156
  ### Model Architecture and Objective
157
 
@@ -159,19 +151,20 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
159
 
160
  ### Compute Infrastructure
161
 
162
- [More Information Needed]
163
 
164
  #### Hardware
165
 
166
- [More Information Needed]
167
 
168
  #### Software
169
 
170
- [More Information Needed]
171
 
172
- ## Citation [optional]
173
 
174
  <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 
175
 
176
  **BibTeX:**
177
 
@@ -191,12 +184,12 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
191
 
192
  [More Information Needed]
193
 
194
- ## Model Card Authors [optional]
195
 
196
- [More Information Needed]
197
 
198
  ## Model Card Contact
199
 
200
- [More Information Needed]
201
 
202
 
 
5
  metrics:
6
  - accuracy
7
  ---
8
+ # Model Card for KorSciDeBERTa
9
 
10
  <!-- Provide a quick summary of what the model is/does. -->
11
 
12
+ KorSciDeBERTa is a model based on the Microsoft DeBERTa architecture, pretrained on a 146 GB corpus of papers, NTIS research projects, patents, news, and Korean Wikipedia. The pretrained model can be used for masked language modeling or next sentence prediction, and it can also be fine-tuned for downstream tasks such as sentence classification, token classification, or question answering.
13
 
14
  ## Model Details
15
 
 
19
 
20
 
21
 
22
+ - **Developed by:** KISTI
23
+ - **Model type:** deberta-v2
24
+ - **Language(s) (NLP):** Korean (ko)
 
 
 
25
 
26
+ ### Model Sources
27
 
28
  <!-- Provide the basic links for the model. -->
29
 
30
+ - **Repository 1:** https://huggingface.co/kisti/korscideberta
31
+ - **Repository 2:** https://aida.kisti.re.kr/
 
32
 
33
  ## Uses
34
 
 
49
  ### Out-of-Scope Use
50
 
51
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
52
+ This model must not be used to intentionally create hostile or alienating environments for people.
53
+ This model cannot be used in 'high-stakes settings'. It was not designed to make important decisions about people or things. The model's outputs may not be factual.
54
+ 'High-stakes settings' include the following:
55
+ use in the medical, political, legal, or financial domains; evaluating individuals for employment, education, or credit; making important decisions automatically; generating (fake) facts; generating summaries that must be highly reliable; generating predictions that must always be correct; and so on.
56
 
57
  ## Bias, Risks, and Limitations
58
 
59
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
60
+ Only corpus data free of copyright issues was used, for research purposes. Users of this model should be aware of the risk factors below.
61
+ Although the corpus used is mostly neutral in character, the model may, as is typical of language models, include some of the following ethics-related elements:
62
+ over- or under-representation of particular viewpoints, stereotypes, personal information, hateful, insulting, or violent language, discriminatory or prejudiced language, irrelevant or repetitive outputs, and so on.
63
 
64
  ### Recommendations
65
 
 
87
 
88
  #### Preprocessing [optional]
89
 
90
+ - Science and technology domain tokenizer (KorSci Tokenizer)
91
+ - The corpus was preprocessed with a tokenizer that merges the [Mecab-ko Tokenizer](https://bitbucket.org/eunjeon/mecab-ko/src/master/), extended with a user dictionary of about 6 million nouns and compound nouns built from the corpus used for this pretrained model, with the existing SentencePiece-BPE.
92
+ - Total 128,100 words
93
+ - Included special tokens ( <unk>, <cls>, <s>, <mask> )
94
+ - File name : spm.model, vocab.txt
95
 
96
  #### Training Hyperparameters
97
 
98
+ - **model_size:** base
99
+ - **num_train_steps:** 1,600,000
100
+ - **train_batch_size:** 4,096 × 4 gradient-accumulation steps = 16,384
101
+ - **learning_rate:** 1e-4
102
+ - **max_seq_length:** 512
103
+ - **vocab_size:** 128,100
104
+ - **Training regime:** fp16 mixed precision <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
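
The batch-size arithmetic in the list above can be checked with a short sketch. It assumes that "accumulative update" means gradient accumulation, so that one optimizer update sees 4,096 × 4 sequences. The function and variable names are illustrative and do not come from the training code, and the token count is only an upper bound because sequences may be shorter than `max_seq_length`.

```python
# Values copied from the hyperparameter list; names are hypothetical.
MICRO_BATCH_SIZE = 4_096      # sequences per accumulation step
ACCUMULATION_STEPS = 4        # assumed meaning of "4 accumulative update"
MAX_SEQ_LENGTH = 512          # maximum tokens per sequence


def effective_batch_size(micro_batch: int, accumulation_steps: int) -> int:
    """Sequences contributing to a single optimizer update."""
    if micro_batch <= 0 or accumulation_steps <= 0:
        raise ValueError("batch size and accumulation steps must be positive")
    return micro_batch * accumulation_steps


def max_tokens_per_update(batch_size: int, max_seq_length: int) -> int:
    """Upper bound on tokens per update (every sequence at full length)."""
    return batch_size * max_seq_length


batch = effective_batch_size(MICRO_BATCH_SIZE, ACCUMULATION_STEPS)
print(batch)                                          # 16384
print(max_tokens_per_update(batch, MAX_SEQ_LENGTH))   # 8388608
```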
105
 
106
  #### Speeds, Sizes, Times [optional]
107
 
 
119
 
120
  <!-- This should link to a Data Card if possible. -->
121
 
122
+ This language model was evaluated by fine-tuning it on a task that assigns research project reports to the science and technology standard classification; the results are shown below.
123
+ - Research project report science and technology standard classification evaluation dataset (doi.org/10.23057/50): 145 classes, 209,454 training examples, 89,767 test examples
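
As a quick sanity check on the dataset sizes quoted above, the following sketch computes the train/test proportions. Only the two counts come from the card; the helper name is illustrative.

```python
# Counts copied from the evaluation dataset description.
TRAIN_EXAMPLES = 209_454
TEST_EXAMPLES = 89_767


def split_fractions(train: int, test: int) -> tuple[float, float]:
    """Return the (train, test) share of the combined dataset."""
    total = train + test
    if total <= 0:
        raise ValueError("dataset must contain at least one example")
    return train / total, test / total


train_frac, test_frac = split_fractions(TRAIN_EXAMPLES, TEST_EXAMPLES)
print(f"train: {train_frac:.1%}, test: {test_frac:.1%}")
```

The counts work out to roughly a 70/30 split over 299,221 examples.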
124
 
125
  #### Factors
126
 
 
142
 
143
 
144
 
 
 
 
 
 
145
 
146
+ ## Technical Specifications
 
 
 
 
 
 
 
 
 
 
 
 
147
 
148
  ### Model Architecture and Objective
149
 
 
151
 
152
  ### Compute Infrastructure
153
 
154
+ NEURON system at the KISTI National Supercomputing Center. HPE ClusterStor E1000, Lustre, Slurm
155
 
156
  #### Hardware
157
 
158
+ 24 × NVIDIA A100 80 GB GPUs
159
 
160
  #### Software
161
 
162
+ Python 3.9, CUDA 11.8, PyTorch 1.10
163
 
164
+ ## Citation
165
 
166
  <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
167
+ Korea Institute of Science and Technology Information (2023): Korean science and technology domain DeBERTa pretrained model (KorSciDeBERTa). Version 1.0. Korea Institute of Science and Technology Information.
168
 
169
  **BibTeX:**
170
 
 
184
 
185
  [More Information Needed]
186
 
187
+ ## Model Card Authors
188
 
189
+ 김경민, 김은희, 김성찬. AI Data Research Division (인공지능데이터연구단), Korea Institute of Science and Technology Information
190
 
191
  ## Model Card Contact
192
 
193
+ 김경민, [email protected]
194
 
195