Xilabs/instructmix-llama-3b

Text Generation

ritabratamaiti committed on Aug 14, 2023

Commit b4f00ea · 1 Parent(s): c04d8ce

Update README.md

Files changed (1)

README.md +84 -0

@@ -4,6 +4,90 @@ datasets:
 - Xilabs/instructmix
 pipeline_tag: text-generation
 ---
+## Model Card for "InstructMix Llama 3B"
+**Model Name:** InstructMix Llama 3B
+**Description:**
+InstructMix Llama 3B is a language model fine-tuned on the InstructMix dataset with parameter-efficient fine-tuning (PEFT). The base model is [openlm-research/open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2).
+**Usage:**
+```py
+import torch
+from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig
+from peft import PeftModel
+# Base model and PEFT adapter on the Hugging Face Hub
+model_path = 'openlm-research/open_llama_3b_v2'
+peft_model_id = 'Xilabs/instructmix-llama-3b'
+tokenizer = LlamaTokenizer.from_pretrained(model_path)
+# Load the base model first, then attach the fine-tuned adapter weights to it
+model = LlamaForCausalLM.from_pretrained(
+    model_path, device_map="auto"
+)
+model = PeftModel.from_pretrained(model, peft_model_id)
+def generate_prompt(instruction, input=None):
+    # Build the prompt; the "### Input:" section is included only when input is given
+    if input:
+        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+### Instruction:
+{instruction}
+### Input:
+{input}
+### Response:"""
+    else:
+        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.
+### Instruction:
+{instruction}
+### Response:"""
+def evaluate(
+    instruction,
+    input=None,
+    temperature=0.5,
+    top_p=0.75,
+    top_k=40,
+    num_beams=5,
+    max_new_tokens=128,
+    **kwargs,
+):
+    prompt = generate_prompt(instruction, input)
+    inputs = tokenizer(prompt, return_tensors="pt")
+    # Move inputs to the device chosen by device_map="auto" instead of assuming CUDA
+    input_ids = inputs["input_ids"].to(model.device)
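+    # Note: temperature, top_p, and top_k only affect sampling; they are ignored
+    # unless do_sample=True is passed (for example through **kwargs)
+    generation_config = GenerationConfig(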
+        temperature=temperature,
+        top_p=top_p,
+        top_k=top_k,
+        num_beams=num_beams,
+        early_stopping=True,
+        repetition_penalty=1.1,
+        **kwargs,
+    )
+    with torch.no_grad():
+        generation_output = model.generate(
+            input_ids=input_ids,
+            generation_config=generation_config,
+            return_dict_in_generate=True,
+            output_scores=True,
+            max_new_tokens=max_new_tokens,
+        )
+    s = generation_output.sequences[0]
+    output = tokenizer.decode(s, skip_special_tokens=True)
+    # The decoded text includes the prompt; keep only what follows the response marker
+    return output.split("### Response:")[1].strip()
+# Sample test instruction used by YouTuber Sam Witteveen (https://www.youtube.com/@samwitteveenai)
+instruction = "What is the meaning of life?"
+print(evaluate(instruction, num_beams=3, temperature=0.1, max_new_tokens=256))
+```
 ## Training procedure