- Developed by: lwef
- License: apache-2.0
- Fine-tuned from model: beomi/Llama-3-Open-Ko-8B
Korean dialogue summarization fine-tuned model.
How to use:
```python
# Prompt (in English): "Please summarize the dialogue below. The dialogue format is '#participant#: utterance'."
prompt_template = '''아래 대화를 요약해 주세요. 대화 형식은 '#대화 참여자#: 대화 내용'입니다.
### 대화 >>>{dialogue}
### 요약 >>>'''
if True:
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "lwef/llm-bench-upload-1", # YOUR MODEL YOU USED FOR TRAINING
        max_seq_length = 2048,
        dtype = None,
        load_in_4bit = True,
    )
    FastLanguageModel.for_inference(model) # Enable native 2x faster inference
dialogue = '''#P01#: 아 학출 과제 너무 어려워... 5쪽 쓸게 없는데 ㅡㅡ #P02#: 몬대몬대 니가 더 잘 써 ㅋㅋ #P01#: 5쪽 대충 의식의 흐름대로 잘 써야지.. 이제 1쪽임;; 5쪽 안에 다 줄만 적어야지 #P02#: 아냐... 니가 분량 준수할 거 같음 거의 꽉 채워서 쓰셈 #P01#: 못써 쓸말업써 #P02#: 이거 중간대체여?? #P01#: ㄴㄴ 그냥 과제임 그래서 더 짜증남'''
formatted_prompt = prompt_template.format(dialogue=dialogue)

# Tokenize the formatted prompt and move it to the GPU
inputs = tokenizer(
    formatted_prompt,
    return_tensors="pt"
).to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    eos_token_id=tokenizer.eos_token_id,  # use the EOS token to explicitly mark where the output should stop
    use_cache=True,
)
decoded_outputs = tokenizer.batch_decode(outputs, skip_special_tokens=True)
result = decoded_outputs[0]
print(result)

# keep only the text generated after the summary marker
result = result.split('### 요약 >>>')[-1].strip()
print(result)
```
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
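For context, a fine-tune like this is usually produced by loading the base model with Unsloth's `FastLanguageModel`, attaching LoRA adapters, and running TRL's `SFTTrainer` over examples in the prompt format shown above. The sketch below is only a minimal illustration of that setup under assumed settings; the dataset rows, LoRA rank, and hyperparameters are placeholders, not the author's actual training configuration.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import Dataset

max_seq_length = 2048

# Load the base model in 4-bit and attach LoRA adapters (standard Unsloth workflow).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "beomi/Llama-3-Open-Ko-8B",
    max_seq_length = max_seq_length,
    dtype = None,
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

# Placeholder training data: each row is a fully formatted prompt followed by the
# reference summary and the EOS token, so the model learns to stop after the summary.
train_rows = [
    {"text": "아래 대화를 요약해 주세요. 대화 형식은 '#대화 참여자#: 대화 내용'입니다.\n"
             "### 대화 >>>#P01#: ... #P02#: ...\n"
             "### 요약 >>>..." + tokenizer.eos_token},
]
train_dataset = Dataset.from_list(train_rows)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = train_dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        num_train_epochs = 1,
        learning_rate = 2e-4,
        output_dir = "outputs",
    ),
)
trainer.train()
```

The "..." strings stand in for real dialogue and summary pairs. Ending each training example with the EOS token is what typically makes the `eos_token_id` stopping criterion in the inference snippet work as intended.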