```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("hiieu/gemma-2-2b-it-lora-vi-en")
tokenizer = AutoTokenizer.from_pretrained("hiieu/gemma-2-2b-it-lora-vi-en")
tokenizer.padding_side = "left"  # left-pad so batched generation continues from each prompt

# One English and one Vietnamese prompt, batched together.
conversations = [
    [{"role": "user", "content": "Good morning everybody"}],
    [{"role": "user", "content": "Xin chào mọi người"}],
]

# Apply the Gemma chat template to the whole batch at once.
batch_input_ids = tokenizer.apply_chat_template(
    conversations,
    add_generation_prompt=True,
    return_tensors="pt",
    padding=True,
    truncation=True,
).to(model.device)

outputs = model.generate(
    batch_input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Keep only the newly generated tokens, dropping the prompt.
responses = outputs[:, batch_input_ids.shape[-1]:]
for response in responses:
    print(tokenizer.decode(response, skip_special_tokens=True))
```

Example output:

```
>>> Chào mọi người
>>> Hello everyone
```
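The repo name and the base model below suggest this is a LoRA adapter over `unsloth/gemma-2-2b-it-bnb-4bit`, so the adapter can also be attached to the 4-bit base explicitly with PEFT. A minimal sketch, assuming the repository hosts standard PEFT adapter files (`adapter_config.json` / adapter weights) and that `peft`, `accelerate`, and `bitsandbytes` are installed:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the 4-bit base model the adapter was finetuned from
# (requires bitsandbytes), then attach the LoRA weights on top.
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/gemma-2-2b-it-bnb-4bit",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "hiieu/gemma-2-2b-it-lora-vi-en")
tokenizer = AutoTokenizer.from_pretrained("hiieu/gemma-2-2b-it-lora-vi-en")
```

From here, the generation code above works unchanged.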
# Uploaded model

- **Developed by:** hiieu
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-2-2b-it-bnb-4bit

This Gemma 2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
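Because it was trained with Unsloth, the adapter can also be loaded through Unsloth's `FastLanguageModel` for faster 4-bit inference. A minimal sketch, assuming `unsloth` and `bitsandbytes` are installed; `max_seq_length=2048` is an illustrative choice, not a value from this card:

```python
from unsloth import FastLanguageModel

# Load the adapter (and its 4-bit base) via Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="hiieu/gemma-2-2b-it-lora-vi-en",
    max_seq_length=2048,  # illustrative; size to fit your prompts
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path
```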