# Model Card for TwinDoc/RedWhale-2-12B-Instruct
This model is the result of SFT (Supervised Fine-Tuning) applied to the pretrained model TwinDoc/RedWhale-2-12B. The SFT targeted ContextQA and summarization tasks.
## Model Details

### Model Description
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
- Developed by: AgileSoda
- Model type: Llama
- Language(s) (NLP): Korean
- License: [More Information Needed]
- Foundation Model: RedWhale-2-12B
### Model Sources

- Repository: [More Information Needed]
- Paper: [More Information Needed]
- Demo: [More Information Needed]
## Uses

### Direct Use
The RedWhale-2-12B-Instruct model is used in the same way as the meta-llama/Llama-3.1-8B-Instruct model. Refer to the official documentation of the serving engine you intend to use. The following are examples.
#### Usage with Transformers
The example code below was written with transformers == 4.48.1.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the model in bfloat16, sharded across all available GPUs.
loading_args = {"torch_dtype": torch.bfloat16, "device_map": "auto"}
model = AutoModelForCausalLM.from_pretrained("TwinDoc/RedWhale-2-12B-Instruct", **loading_args)
tokenizer = AutoTokenizer.from_pretrained("TwinDoc/RedWhale-2-12B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "대한민국의 수도는?"},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs.to(model.device))
print(tokenizer.decode(outputs[0]))
```
"<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024
You are a helpful AI assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>
๋ํ๋ฏผ๊ตญ์ ์๋๋?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
๋ํ๋ฏผ๊ตญ์ ์๋๋ ์์ธ์
๋๋ค.<|eot_id|>"
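The decoded string above contains the full chat template, not just the answer. If you only want the assistant's reply, a minimal sketch (continuing from the snippet above) is to slice off the prompt tokens before decoding:

```python
# Decode only the newly generated tokens, skipping the prompt and special tokens.
prompt_length = inputs.shape[-1]
reply = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(reply)  # 대한민국의 수도는 서울입니다.
```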
#### Usage with vLLM
The example code below was written with vllm == 0.6.6.
```python
from vllm import LLM, SamplingParams
import os

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # arrange GPU devices starting from 0
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3,4,5,6,7"

repo_id = "TwinDoc/RedWhale-2-12B-Instruct"
tensor_parallel_size = 8  # number of GPUs

llm = LLM(
    model=repo_id,
    tensor_parallel_size=tensor_parallel_size,
)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "대한민국의 수도는?"},
]
sampling_params = SamplingParams(
    temperature=0.8,
    top_p=0.9,
    max_tokens=8192,
)
outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```
Output:

```
대한민국의 수도는 서울입니다.
```
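Beyond offline inference, vLLM can also expose the model through an OpenAI-compatible HTTP server. The original card does not cover this, so the following is only a sketch, assuming the server was started with `vllm serve TwinDoc/RedWhale-2-12B-Instruct --tensor-parallel-size 8` on the default port 8000:

```python
# Query an assumed local vLLM OpenAI-compatible server
# (started with: vllm serve TwinDoc/RedWhale-2-12B-Instruct --tensor-parallel-size 8).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="TwinDoc/RedWhale-2-12B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "대한민국의 수도는?"},
    ],
    temperature=0.8,
    top_p=0.9,
)
print(response.choices[0].message.content)
```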
## Training Details

### Training Data

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
- 200 examples from the allganize/rag-ko test set
- 100 Mirae Asset context QA examples
- 140 AIA context QA examples
- 63 BNK context QA examples
#### Metrics
Performance was measured with an LLM as a judge. Refer to Our Leaderboard for the prompt, the judge model, and the evaluation results. The model listed as "RedWhale2 12B 0.98 SFT v4 M" on Our Leaderboard is the "TwinDoc/RedWhale-2-12B-Instruct" model.
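The exact judge prompt and judge model are documented on Our Leaderboard. The snippet below is only an illustrative sketch of the general LLM-as-a-judge pattern; the judge model, prompt wording, and scoring scale here are hypothetical and are not the ones used for this evaluation:

```python
# Hypothetical illustration of LLM-as-a-judge scoring; the actual prompt and
# judge model used for this card are documented on Our Leaderboard.
JUDGE_PROMPT = """You are an impartial judge. Given a context, a question,
a reference answer, and a model answer, rate the model answer from 1 to 5
for factual consistency with the context. Reply with a single integer.

Context: {context}
Question: {question}
Reference answer: {reference}
Model answer: {candidate}
Score:"""

def judge_score(client, context, question, reference, candidate, judge_model="gpt-4o"):
    """Ask a judge LLM (hypothetical choice) to score one ContextQA answer."""
    prompt = JUDGE_PROMPT.format(
        context=context, question=question, reference=reference, candidate=candidate
    )
    response = client.chat.completions.create(
        model=judge_model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,  # deterministic judging
    )
    return int(response.choices[0].message.content.strip())
```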