francislabounty committed
Commit 7870868
1 Parent(s): a7e57d8

Update README.md

Files changed (1): README.md (+57, -0)
README.md:

---
license: mit
---
These are LoRA weights only, for research purposes - nothing from the foundation model is included.
Trained using Anthropic's HH-RLHF dataset, which can be found here: https://huggingface.co/datasets/Anthropic/hh-rlhf
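
For reference, the training data can be inspected with the Hugging Face `datasets` library. A minimal sketch (the dataset is not needed for inference with these weights):

```python
# Optional: look at the HH-RLHF data the adapter was trained on.
from datasets import load_dataset

hh = load_dataset("Anthropic/hh-rlhf")
print(hh["train"][0]["chosen"][:300])  # each example has a "chosen" and a "rejected" conversation
```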

Sample usage:
```python
import torch
from peft import PeftModel
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "decapoda-research/llama-30b-hf"
peft_path = 'serpdotai/llama-hh-lora-30B'
tokenizer_path = 'decapoda-research/llama-30b-hf'

# Load the base model in 8-bit, then attach the LoRA adapter on top of it
model = LlamaForCausalLM.from_pretrained(model_path, load_in_8bit=True, device_map="auto")  # or something like {"": 0}
model = PeftModel.from_pretrained(model, peft_path, torch_dtype=torch.float16, device_map="auto")  # or something like {"": 0}
tokenizer = LlamaTokenizer.from_pretrained(tokenizer_path)

# The adapter was trained on HH-style dialogue, so prompts use "User:"/"Assistant:" turns
batch = tokenizer("\n\nUser: Are you sentient?\n\nAssistant:", return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        input_ids=batch["input_ids"].cuda(),
        attention_mask=batch["attention_mask"].cuda(),
        max_length=100,
        do_sample=True,
        top_k=50,
        top_p=1.0,
        temperature=1.0,
        use_cache=False,
    )
print(tokenizer.decode(out[0]))
```

The model will continue the conversation between the user and itself. If you want to use it as a chatbot, you can alter the generate call to include stop sequences for 'User:' and 'Assistant:', or strip off anything past the assistant's original response before returning it, as in the sketch below.
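
A minimal sketch of the stripping approach, reusing `tokenizer` and `out` from the sample above; the helper name is made up for illustration:

```python
# Hypothetical post-processing: keep only the assistant's first reply by
# cutting the decoded output at the next user turn.
def extract_first_reply(generated: str) -> str:
    reply = generated.split("Assistant:", 1)[-1]  # text after the first assistant turn
    return reply.split("User:", 1)[0].strip()     # drop anything past the next user turn

text = tokenizer.decode(out[0], skip_special_tokens=True)
print(extract_first_reply(text))
```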

Trained for 2 epochs with a sequence length of 368, a mini-batch size of 1, and gradient accumulation of 15 on 8 A6000s, for an effective batch size of 120 (1 × 15 × 8).

Training settings (see the sketch after this list):
- lr: 2.0e-04
- lr_scheduler_type: linear
- warmup_ratio: 0.06
- weight_decay: 0.1
- optimizer: adamw_torch_fused
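
As a rough sketch only, these settings could be expressed as Hugging Face `TrainingArguments`; the actual training script is not included in this repo, and `output_dir` below is a placeholder:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the listed settings - not the original training code.
training_args = TrainingArguments(
    output_dir="llama-hh-lora-30B",   # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=1,    # mini-batch size of 1
    gradient_accumulation_steps=15,   # 1 x 15 x 8 GPUs = effective batch size of 120
    learning_rate=2.0e-4,
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    weight_decay=0.1,
    optim="adamw_torch_fused",
)
```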

LoRA config (see the `LoraConfig` sketch after this list):
- target_modules: ['q_proj', 'k_proj', 'v_proj', 'o_proj']
- r: 64
- lora_alpha: 32
- lora_dropout: 0.05
- bias: "none"
- task_type: "CAUSAL_LM"
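
The same hyperparameters expressed as a `peft` `LoraConfig`, useful if you want to reproduce the adapter shape (a sketch, not the original training script):

```python
from peft import LoraConfig

# The LoRA hyperparameters listed above as a peft LoraConfig.
lora_config = LoraConfig(
    target_modules=['q_proj', 'k_proj', 'v_proj', 'o_proj'],
    r=64,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```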