---
datasets:
- tatsu-lab/alpaca
- the_pile
language:
- en
library_name: transformers
tags:
- peft
- lora
- instruct
- alpaca
- gptj
---
# Instruct-GPT-J "Vicuña"

<p style="color: green"> By 3/12/2023 this model will most likely be much better: I actually read the LoRA paper, found cleaner data, and trained for longer with better hyperparameters (rank-8 adaptation on all QKVO projections for 200 steps @ batch size 64). </p>

A demo is available in free Google Colab here: https://bit.ly/3K1P4PQ. Just change the model dropdown to the name of this model.

The [EleutherAI/gpt-j-6B](https://hf.co/EleutherAI/gpt-j-6B) model, fine-tuned on the [Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca) instruction dataset with [low-rank adaptation](https://arxiv.org/abs/2106.09685) (LoRA). This is a personal project, not an official EleutherAI release.
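The note above mentions rank-8 adaptation on all QKVO projections for 200 steps at batch size 64. As a rough illustration only (the exact training configuration for this checkpoint isn't published, so `lora_alpha` and `lora_dropout` below are assumptions), the corresponding `peft` setup might look like:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Base model; this loads the full weights, so it is memory-hungry
base = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

lora_config = LoraConfig(
    r=8,                    # rank-8 update matrices, per the note above
    lora_alpha=16,          # scaling factor: an assumption, not confirmed
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj"],  # QKVO in GPT-J
    lora_dropout=0.05,      # assumption
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Training then proceeds as ordinary causal-LM fine-tuning on the Alpaca prompts, with gradients flowing only through the adapter weights.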

## Use:

```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = "crumb/Instruct-GPT-J"
config = PeftConfig.from_pretrained(peft_model_id)
# Load the base model in 8-bit precision (requires the bitsandbytes package)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map='auto',
    revision='sharded',
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)

# This example is in the Alpaca training set; move the inputs to the model's device
batch = tokenizer("Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: How can we reduce air pollution? ### Response:", return_tensors='pt').to(model.device)

with torch.cuda.amp.autocast():
  output_tokens = model.generate(**batch, max_new_tokens=50)

print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))
# One way to reduce air pollution is to reduce the amount of emissions from vehicles. This can be done by implementing stricter emission standards and increasing the use of electric vehicles. Another way to reduce air pollution is to reduce the amount of waste produced by industries.
```

A function to turn an instruction into a prompt for the model could be written as follows:

```python
def prompt(instruction, input_text=''):
  # Mirrors the two Alpaca prompt templates (with and without an input field)
  if input_text == '':
    return f"Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: {instruction} ### Response: "
  return f"Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. ### Instruction: {instruction} ### Input: {input_text} ### Response: "
```

Here `input_text` is optional context for the model to act on based on the instruction.
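For example, reusing the `model` and `tokenizer` from the loading snippet above (the instruction here is just a made-up example):

```python
text = prompt("Give three tips for staying healthy.")
batch = tokenizer(text, return_tensors='pt').to(model.device)

with torch.cuda.amp.autocast():
  output_tokens = model.generate(**batch, max_new_tokens=50)

print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```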

### Citations

```bibtex
@misc{gpt-j,
  author = {Wang, Ben and Komatsuzaki, Aran},
  title = {{GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model}},
  howpublished = {\url{https://github.com/kingoflolz/mesh-transformer-jax}},
  year = 2021,
  month = May
}
```

```bibtex
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```