---
language:
  - ko
metrics:
  - bleu
pipeline_tag: text2text-generation
---

🌊 Jeju-Standard Bidirectional Translation Model (제주어-표준어 양방향 번역 모델)

1. Introduction

🧑‍🤝‍🧑 Members

  • Bitamin 12th cohort: 구준회, 이서현, 이예린
  • Bitamin 13th cohort: 김윤영, 김재겸, 이형석

Github Link

How to use this Model

  • You can use this model for inference with the transformers library.
  • Below is an example of how to load the model and generate translations:
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Set up the device (GPU or CPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Junhoee/Kobart-Jeju-translation")
model = AutoModelForSeq2SeqLM.from_pretrained("Junhoee/Kobart-Jeju-translation").to(device)

# Set up the input text
# Prepend a [제주] (Jeju) or [표준] (Standard) token matching the dialect
# of the input sentence, then the sentence itself
input_text = "[표준] 안녕하세요"

# Tokenize the input text
input_ids = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True).input_ids.to(device)

# Generate the translation
outputs = model.generate(input_ids, max_length=64)

# Decode and print the output
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Model Output:", decoded_output)
Model Output: 안녕하수꽈

# Set up the input text
# Prepend a [제주] (Jeju) or [표준] (Standard) token matching the dialect
# of the input sentence, then the sentence itself
input_text = "[제주] 안녕하수꽈"

# Tokenize the input text
input_ids = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True).input_ids.to(device)

# Generate the translation
outputs = model.generate(input_ids, max_length=64)

# Decode and print the output
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Model Output:", decoded_output)
Model Output: 안녕하세요
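The two snippets above differ only in the direction token that gets prepended, so they can be folded into a small helper. This is an illustrative sketch, not part of the released card: `tag_source` and `translate` are hypothetical names, and the already-loaded `tokenizer`, `model`, and `device` are passed in explicitly.

```python
def tag_source(text: str, source: str) -> str:
    """Prepend the direction token; `source` is the dialect of `text`,
    either "제주" (Jeju) or "표준" (Standard)."""
    if source not in ("제주", "표준"):
        raise ValueError('source must be "제주" or "표준"')
    return f"[{source}] {text}"

def translate(text, source, tokenizer, model, device, max_length=64):
    """Translate `text` from the `source` dialect into the other one,
    using an already-loaded tokenizer/model as in the snippets above."""
    input_ids = tokenizer(tag_source(text, source),
                          return_tensors="pt").input_ids.to(device)
    outputs = model.generate(input_ids, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

For example, `translate("안녕하세요", "표준", tokenizer, model, device)` should reproduce the first snippet's output.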

Parent Model

2. Dataset - about 930,000 rows

  • AI-Hub (Jeju dialect utterance data + middle-aged speakers' dialect utterance data)
  • Github (Kakao Brain JIT data)
  • Others
    • Jeju dictionary data (crawled from the Jeju Provincial Government website)
    • Song-lyric translation data (collected by hand from the 뭐랭하맨 YouTube channel)
    • Book data (collected by hand from the books 제주방언 그 맛과 멋 and 부에나도 지꺼져도)
    • 2018 Jeju oral recordings collection (collected by hand; used as the evaluation data)
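The card does not show how training examples were built from these sources, but the inference examples suggest each aligned (Jeju, Standard) sentence pair can serve both directions once the source-dialect token is prefixed. A minimal sketch under that assumption; `make_bidirectional_pairs` is a hypothetical helper, not released code.

```python
def make_bidirectional_pairs(jeju: str, standard: str):
    """From one aligned sentence pair, emit (input, target) examples for
    both translation directions. The [제주]/[표준] prefix marks the
    dialect of the *input* sentence, as in the inference examples."""
    return [
        (f"[제주] {jeju}", standard),    # Jeju -> Standard
        (f"[표준] {standard}", jeju),    # Standard -> Jeju
    ]
```

Applied over the whole corpus, this doubles each aligned pair into one example per direction.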

3. Hyper Parameters

  • Epochs : 3
  • Learning Rate : 2e-5
  • Weight Decay : 0.01
  • Batch Size : 32
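These settings map directly onto `Seq2SeqTrainingArguments` from transformers. A configuration sketch, assuming that class was used; `output_dir` and anything not listed above are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above; output_dir is an assumption.
training_args = Seq2SeqTrainingArguments(
    output_dir="kobart-jeju-translation",
    num_train_epochs=3,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=32,
)
```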

4. BLEU Score

  • On the 2018 Jeju oral recordings collection

    • Jeju -> Standard : 0.76
    • Standard -> Jeju : 0.5
  • On the validation split of the AI-Hub Jeju dialect utterance data

    • Jeju -> Standard : 0.89
    • Standard -> Jeju : 0.77
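The scores above are BLEU: a brevity penalty times the geometric mean of modified n-gram precisions. A minimal stdlib sketch of sentence-level BLEU for illustration only; the reported numbers were presumably computed with a standard implementation, whose tokenization and smoothing may differ.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Sentence-level BLEU: brevity penalty times the geometric mean of
    modified n-gram precisions for n = 1..max_n (no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0  # any zero precision collapses the geometric mean
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An identical candidate and reference score 1.0; a candidate sharing no words with the reference scores 0.0.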

5. CREDIT