---
language:
- id
pipeline_tag: text-generation
---
About:
This is a 🦙 LLaMA model fine-tuned on a translated Alpaca dataset in Bahasa Indonesia. It uses Parameter-Efficient Fine-Tuning (PEFT) with LoRA so that it can be trained on a consumer-grade GPU.
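For context, a fine-tune along these lines can be set up with peft's int8 + LoRA utilities. The sketch below is illustrative only: the LoRA hyperparameters (r, lora_alpha, target_modules, lora_dropout) are assumed values, not the exact configuration used to train this model.

import torch
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Load the base model in 8-bit and make it trainable in int8
base_model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)
base_model = prepare_model_for_int8_training(base_model)

# Assumed LoRA hyperparameters; the actual values for this model may differ
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small LoRA matrices are trainable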
How to Use:
Load the 🦙 Alpaca-LoRA model
import torch
import bitsandbytes as bnb  # required at runtime for 8-bit loading
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig
from peft import PeftModel

# LoRA adapter weights for this model
peft_model_id = "firqaaa/indo-Alpaca-LoRA-7b"

# Load the base LLaMA-7B tokenizer and the model in 8-bit precision
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(model, peft_model_id)
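Before running inference, it is worth switching the model to evaluation mode; the optional check below also confirms that only the adapter weights are trainable.

# Disable dropout (including in the LoRA layers) for inference
model.eval()

# Sanity check: only a small fraction of parameters belong to the LoRA adapter
model.print_trainable_parameters()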
Prompt Template
def generate_prompt(instruction, input=None):
    # Builds the Indonesian Alpaca-style prompt
    # ("Petunjuk"/"Panduan" = instruction, "Masukan" = input)
    if input:
        return f"""Berikut ini adalah petunjuk yang menjelaskan tugas, serta masukan yang menyediakan konteks tambahan. Tulis balasan yang melengkapi permintaan dengan tepat.
Petunjuk:
{instruction}
Masukan:
{input}
Output:"""
    else:
        return f"""Berikut ini terdapat panduan yang menjelaskan tugas. Mohon tuliskan balasan yang melengkapi permintaan dengan tepat.
Panduan:
{instruction}
Output:"""
Evaluation
Feel free to change the parameters inside GenerationConfig to get better results: temperature controls randomness, top_p sets the nucleus-sampling threshold, and num_beams sets the beam-search width.
generation_config = GenerationConfig(
    temperature=0.2,
    top_p=0.75,
    num_beams=8,
)
def evaluate(instruction, input=None):
    prompt = generate_prompt(instruction, input)
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].cuda()
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )
    for s in generation_output.sequences:
        output = tokenizer.decode(s)
        # Print only the generated text after the "Output:" marker
        print("Output:", output.split("Output:")[1].strip())
# Type your question/instruction at the prompt
evaluate(input("Petunjuk: "))
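You can also skip the interactive prompt and pass an instruction directly; the instruction below is just an example.

# Example instruction: "Give three tips for staying healthy."
evaluate("Berikan tiga tips untuk menjaga kesehatan.")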
Note:
Due to the currently high training loss and limited compute resources, we will update this model frequently so it can produce better results.