Model Description

kobart-oxquiz is a KoBART model fine-tuned for O/X (true/false) quiz generation.
Given a user-provided passage (context) and a truth value (ox), it generates a related quiz statement.
The model was fine-tuned from gogamza/kobart-base-v2.
Because the generated sentences are prone to repetition, we recommend adjusting several generation parameters, as in the example below, before using the model.

How to use

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import torch
import re

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Quiz-statement generation function
def generate_ox(model, tokenizer, context, ox):
    ox_str = "True" if ox else "False"
    input_text = f"context: {context}\n ox: {ox_str}"
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True).to(device)
    inputs.pop("token_type_ids", None)  # KoBART does not use token_type_ids
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_length=25,
            do_sample=True,
            top_k=50,
            top_p=0.9,
            repetition_penalty=3.0,
            no_repeat_ngram_size=2,
            num_beams=10,
            early_stopping=True,
        )
    question = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    # Clean up repeated sentence endings and stray periods
    question = re.sub(r'λ‹€(\.λ‹€)+\.', 'λ‹€.', question)
    question = re.sub(r'\.\.+', '.', question)
    question = re.sub(r'(?<=λ‹€\.)(?=[^\s])', ' ', question)
    question = question.strip()
    # Keep text up to the last complete sentence (rfind returns -1 if no period is found)
    last_period_index = question.rfind('.')
    if last_period_index != -1:
        question = question[:last_period_index + 1]
    return question

# λͺ¨λΈκ³Ό ν† ν¬λ‚˜μ΄μ € λ‘œλ“œ
model_name = "asteroidddd/kobart-oxquiz"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)
model.eval()

# Example output
print("\n[Example]")
context = "μ˜¬ν•΄ ν•œκ΅­μ˜ GDP μ„±μž₯λ₯ μ€ 2.5%둜 졜근 10λ…„ 사이 κ°€μž₯ 높은 수치둜 μ˜ˆμƒλœλ‹€."
question_T = generate_ox(model, tokenizer, context, True)
question_F = generate_ox(model, tokenizer, context, False)

print(f"Context\t\t: {context}")
print(f"True question\t: {question_T}")
print(f"False question\t: {question_F}\n")

Training Data

[Dacon Monthly Korean Sentence Relationship Classification Competition (μ›”κ°„ 데이콘 ν•œκ΅­μ–΄ λ¬Έμž₯ 관계 λΆ„λ₯˜ κ²½μ§„λŒ€νšŒ)]
The original dataset is an NLI (natural language inference) dataset consisting of Premise (sentence), Hypothesis, and Label (relationship).
For the model input, Premise was renamed to context and Label was renamed to ox.
For the target output, Hypothesis was renamed to question; Label values of Entailment were mapped to True, Contradiction to False, and Neutral examples were excluded.
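
For reference, here is a minimal preprocessing sketch of the conversion described above. The file name, column names, and label strings are assumptions for illustration; the original preprocessing script is not included in this card.

import pandas as pd

# Hypothetical sketch of the conversion described above (file and column names assumed).
df = pd.read_csv("train_data.csv")                    # columns: Premise, Hypothesis, Label
df = df[df["Label"].str.lower() != "neutral"]         # drop Neutral examples
converted = pd.DataFrame({
    "context": df["Premise"],                         # Premise -> context (input)
    "ox": df["Label"].str.lower().eq("entailment"),   # Entailment -> True, Contradiction -> False (input)
    "question": df["Hypothesis"],                     # Hypothesis -> question (target output)
})
# Input text follows the same format used by generate_ox above.
converted["input_text"] = ("context: " + converted["context"]
                           + "\n ox: " + converted["ox"].map({True: "True", False: "False"}))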
