---
language:
- id
pipeline_tag: text-generation
license: cc-by-nc-4.0
library_name: transformers
tags:
- llama
- alpaca
- lora
---
# About : 
This 🦙 Llama model was fine-tuned on an Alpaca dataset translated into Bahasa Indonesia. It uses Parameter-Efficient Fine-Tuning (PEFT) with LoRA so that training fits on consumer-grade GPU hardware.
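
For reference, the sketch below shows the kind of PEFT/LoRA setup used for this style of fine-tune. The hyperparameters (rank, alpha, target modules) are illustrative assumptions, not the exact values used to train this adapter.

```python
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Load the base model in 8-bit so it fits on a single consumer GPU
base_model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)
base_model = prepare_model_for_int8_training(base_model)

# Illustrative LoRA settings; the actual values for this adapter may differ
lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor for the update
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trained
```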

# How to Use : 

## Load the 🦙 Alpaca-LoRA model

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig
from peft import PeftModel

peft_model_id = "firqaaa/indo-Alpaca-LoRA-7b"

# Tokenizer and 8-bit base model (8-bit loading requires the bitsandbytes package)
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf",
                                         load_in_8bit=True,
                                         device_map="auto")

# Attach the Indonesian Alpaca LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)
```
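
Optionally, a quick sanity check; nothing here is required, it just confirms that the adapter loaded and where the weights were placed:

```python
model.eval()                            # inference mode (disables dropout)
print(model.peft_config)                # LoRA settings stored in the adapter
print(next(model.parameters()).device)  # e.g. cuda:0 when a GPU is available
```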
## Prompt Template
Prepare the prompt using the standard Alpaca instruction template:

```python
# "Write the Fibonacci sequence. Write the answer/response in Bahasa Indonesia."
instruction = "Tuliskan deret bilangan Fibonacci. Tulis jawaban/respons dalam Bahasa Indonesia."

PROMPT = f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:"""
```
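
If you want to try several instructions, it can be convenient to wrap the template in a small helper. `build_prompt` below is just an illustrative name, not part of this repository:

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format an instruction (and optional input) with the Alpaca template."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:"
        )
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:"
    )

PROMPT = build_prompt(instruction)
```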

## Evaluation
Feel free to change the parameters inside `GenerationConfig` to get better results.

```python
inputs = tokenizer(
    PROMPT,
    return_tensors="pt"
)
input_ids = inputs["input_ids"].cuda()  # move the prompt tokens to the GPU

generation_config = GenerationConfig(
    temperature=0.1,
    top_p=0.95,
    top_k=40,
    num_beams=4,
    repetition_penalty=1.15,
)
print("Generating...")
print("Instruction : {}".format(instruction))

with torch.no_grad():
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=512,
    )
print("Response : ")
for s in generation_output.sequences:
    # Everything after the "### Response:" marker is the model's answer
    print(tokenizer.decode(s).split("### Response:")[1])
```

## Note :
Due to the high training loss and limited compute, we will update this model frequently to improve the quality of the generated text.