license: mit
language:
  - ko
metrics:
  - accuracy

Model Card for KorSciDeBERTa

KorSciDeBERTa is a model based on the Microsoft DeBERTa architecture, pretrained on a total of 146 GB of corpora: papers, NTIS research projects, patents, news, and Korean Wikipedia. The pretrained model can be used for masked language modeling or next sentence prediction, and it can also be fine-tuned for downstream tasks such as sentence classification, token classification, or question answering.

Model Details

Model Description

  • Developed by: KISTI
  • Model type: deberta-v2
  • Language(s) (NLP): Korean (ko)

Model Sources

Uses

Downstream Use - Load model directly

git clone https://huggingface.co/kisti/korscideberta; cd korscideberta

  • korscideberta-abstractcls.ipynb

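# Custom tokenizer module from the cloned repository
from tokenization_korscideberta import DebertaV2Tokenizer
from transformers import AutoModelForSequenceClassification

tokenizer = DebertaV2Tokenizer.from_pretrained("kisti/korscideberta")

# Load the pretrained weights with a 6-label sequence-classification head
model = AutoModelForSequenceClassification.from_pretrained("kisti/korscideberta", num_labels=6, hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1)
# For masked language modeling instead (also import AutoModelForMaskedLM from transformers):
# model = AutoModelForMaskedLM.from_pretrained("kisti/korscideberta")

# ... (the code that builds `trainer` is elided in the original) ...

# Fine-tune, save the training metrics, and upload the result to the Hub
train_metrics = trainer.train().metrics
trainer.save_metrics("train", train_metrics)
trainer.push_to_hub()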

Out-of-Scope Use

์ด ๋ชจ๋ธ์€ ์˜๋„์ ์œผ๋กœ ์‚ฌ๋žŒ๋“ค์—๊ฒŒ ์ ๋Œ€์ ์ด๋‚˜ ์†Œ์™ธ๋œ ํ™˜๊ฒฝ์„ ์กฐ์„ฑํ•˜๋Š”๋ฐ ์‚ฌ์šฉ๋˜์–ด์„œ๋Š” ์•ˆ ๋ฉ๋‹ˆ๋‹ค. ์ด ๋ชจ๋ธ์€ '๊ณ ์œ„ํ—˜ ์„ค์ •'์—์„œ ์‚ฌ์šฉ๋  ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค. ์ด ๋ชจ๋ธ์€ ์‚ฌ๋žŒ์ด๋‚˜ ์‚ฌ๋ฌผ์— ๋Œ€ํ•œ ์ค‘์š”ํ•œ ๊ฒฐ์ •์„ ๋‚ด๋ฆด ์ˆ˜ ์žˆ๊ฒŒ ์„ค๊ณ„๋˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค. ๋ชจ๋ธ์˜ ์ถœ๋ ฅ๋ฌผ์€ ์‚ฌ์‹ค์ด ์•„๋‹ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. '๊ณ ์œ„ํ—˜ ์„ค์ •'์€ ๋‹ค์Œ๊ณผ ๊ฐ™์€ ์‚ฌํ•ญ์„ ํฌํ•จํ•ฉ๋‹ˆ๋‹ค: ์˜๋ฃŒ/์ •์น˜/๋ฒ•๋ฅ /๊ธˆ์œต ๋ถ„์•ผ์—์„œ์˜ ์‚ฌ์šฉ, ๊ณ ์šฉ/๊ต์œก/์‹ ์šฉ ๋ถ„์•ผ์—์„œ์˜ ์ธ๋ฌผ ํ‰๊ฐ€, ์ž๋™์œผ๋กœ ์ค‘์š”ํ•œ ๊ฒƒ์„ ๊ฒฐ์ •ํ•˜๊ธฐ, (๊ฐ€์งœ)์‚ฌ์‹ค์„ ์ƒ์„ฑํ•˜๊ธฐ, ์‹ ๋ขฐ๋„ ๋†’์€ ์š”์•ฝ๋ฌธ ์ƒ์„ฑ, ํ•ญ์ƒ ์˜ณ์•„์•ผ๋งŒ ํ•˜๋Š” ์˜ˆ์ธก ์ƒ์„ฑ ๋“ฑ.

Bias, Risks, and Limitations

For research purposes, only corpus data free of copyright issues was used. Users of this model should be aware of the risk factors below. Although the corpora used are mostly neutral in character, the model may, as is typical of language models, exhibit some of the following ethics-related issues: over- or under-representation of particular viewpoints, stereotypes, personal information, hateful, insulting, or violent language, discriminatory or prejudiced language, and irrelevant or repetitive outputs.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

  • Science and technology domain tokenizer (KorSci Tokenizer)
  • The corpus was preprocessed with a tokenizer that merges the Mecab-ko Tokenizer, extended with a user dictionary of about 6 million nouns and compound nouns derived from the corpus used for this pretrained model, with the existing SentencePiece-BPE.
  • Total 128,100 words
  • Includes special tokens (<unk>, <cls>, <s>, <mask>)
  • File names: spm.model, vocab.txt

Training Hyperparameters

  • model_size: base
  • num_train_steps: 1,600,000
  • train_batch_size: 4,096 × 4 accumulated updates = 16,384
  • learning_rate: 1e-4
  • max_seq_length: 512
  • vocab_size: 128,100
  • Training regime: fp16 mixed precision
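
The batch-size arithmetic in the list above can be checked with a short sketch. It assumes that batch sizes are counted in sequences and that "accumulative update" means gradient accumulation over 4 batches per optimizer update; the card does not state either explicitly, and the variable names are illustrative only.

```python
# Values copied from the hyperparameter list above.
per_step_batch = 4_096      # train_batch_size before accumulation
accumulation_steps = 4      # "4 accumulative update"
max_seq_length = 512

# Effective batch size per optimizer update.
effective_batch = per_step_batch * accumulation_steps

# ASSUMPTION: batch size is measured in sequences, so this is an upper
# bound on tokens per update (reached only if every sequence is full length).
max_tokens_per_update = effective_batch * max_seq_length

print(f"effective batch size: {effective_batch:,}")
print(f"max tokens per update: {max_tokens_per_update:,}")
```

Under these assumptions the effective batch size matches the 16,384 stated in the list.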

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

The language model was evaluated by fine-tuning it on the task of assigning research project reports to the standard science and technology classification. The results are given below.

  • Research project report standard science and technology classification evaluation dataset (doi.org/10.23057/50): 145 classes, 209,454 training examples, 89,767 test examples

Metrics

  • F1-micro/macro: a prediction counts as a success if at least one of the top-3 gold labels is predicted.
  • F1-strict: success is credited according to how many of the top-3 gold labels are predicted.
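
The two success criteria can be illustrated with a minimal sketch. This is one possible reading, not the evaluation code behind the reported scores: the card does not say how hits are aggregated into F1, so `strict_f1` below is an assumed micro-averaged F1 over label sets, and the function names and example labels are hypothetical.

```python
from typing import Sequence


def lenient_hit(gold: Sequence[str], predicted: Sequence[str]) -> bool:
    """Lenient criterion: at least one gold label is among the predictions."""
    return bool(set(gold) & set(predicted))


def strict_f1(golds: Sequence[Sequence[str]], preds: Sequence[Sequence[str]]) -> float:
    """Strict criterion (assumed): every matched label counts once,
    aggregated as micro-averaged F1 over all examples."""
    true_pos = sum(len(set(g) & set(p)) for g, p in zip(golds, preds))
    n_pred = sum(len(set(p)) for p in preds)
    n_gold = sum(len(set(g)) for g in golds)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: up to three gold classes and three predictions per report.
golds = [["A", "B", "C"], ["D"]]
preds = [["A", "X", "Y"], ["E", "F", "G"]]

hits = [lenient_hit(g, p) for g, p in zip(golds, preds)]
print("lenient hits:", hits)
print("strict F1:", strict_f1(golds, preds))
```

In the toy data the first report is a lenient success because one of its three gold labels is predicted, while the strict score credits only that single matched label.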

Results

F1-micro: 0.85, F1-macro: 0.52, F1-strict: 0.71

Technical Specifications

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

NEURON system at the KISTI National Supercomputing Center. HPE ClusterStor E1000, Lustre, Slurm

Hardware

24 × NVIDIA A100 80 GB GPUs

Software

Python 3.9, CUDA 11.8, PyTorch 1.10

Citation

Korea Institute of Science and Technology Information (2023): Korean science and technology domain DeBERTa pretrained model (KorSciDeBERTa). Version 1.0. Korea Institute of Science and Technology Information.

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors

김경민, 김은희, 김성찬. Artificial Intelligence Data Research Division, Korea Institute of Science and Technology Information

Model Card Contact

김경민, kkmkorea kisti.re.kr