+---
+license: mit
+language: fr
+thumbnail: null
+---
+# Vigogne: French Instruct LLaMA
+This repo contains a low-rank adapter (LoRA) for LLaMA-7b, fine-tuned on the [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) dataset.
+Instructions for running it can be found at https://github.com/tloen/alpaca-lora.
+## Usage
+```python
+import torch
+from peft import PeftModel
+from transformers import GenerationConfig, LlamaForCausalLM, LlamaTokenizer
+PROMPT_DICT = {
+    "prompt_input": (
+        "Ci-dessous se trouve une instruction qui décrit une tâche, associée à une entrée qui fournit un contexte supplémentaire. Écrivez une réponse qui complète correctement la demande.\n\n"
+        "### Instruction:\n{instruction}\n\n### Entrée:\n{input}\n\n### Réponse:\n"
+    ),
+    "prompt_no_input": (
+        "Ci-dessous se trouve une instruction qui décrit une tâche. Écrivez une réponse qui complète correctement la demande.\n\n"
+        "### Instruction:\n{instruction}\n\n### Réponse:\n"
+    ),
+}
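+device = "cuda"
+# Load the base LLaMA-7b tokenizer and weights. 8-bit loading requires the
+# bitsandbytes package; device_map="auto" requires accelerate.
+tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
+model = LlamaForCausalLM.from_pretrained(
+    "decapoda-research/llama-7b-hf",
+    load_in_8bit=True,
+    torch_dtype=torch.float16,
+    device_map="auto",
+)
+# Wrap the base model with the Vigogne LoRA adapter weights.
+model = PeftModel.from_pretrained(
+    model,
+    "bofenghuang/vigogne-lora-7b",
+    torch_dtype=torch.float16,
+)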
+def instruct(
+    instruction,
+    input=None,
+    temperature=0.1,
+    top_p=1.0,
+    max_new_tokens=512,
+    **kwargs,
+):
+    prompt = (
+        PROMPT_DICT["prompt_input"].format_map({"instruction": instruction, "input": input})
+        if input is not None
+        else PROMPT_DICT["prompt_no_input"].format_map({"instruction": instruction})
+    )
+    tokenized_inputs = tokenizer(prompt, return_tensors="pt")
+    input_ids = tokenized_inputs["input_ids"].to(device)
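+    # Note: temperature and top_p only affect the output when sampling is enabled,
+    # e.g. by passing do_sample=True through **kwargs; otherwise decoding is
+    # deterministic (greedy, or beam search if num_beams > 1 is passed).
+    generation_config = GenerationConfig(
+        temperature=temperature,
+        top_p=top_p,
+        **kwargs,
+    )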
+    with torch.inference_mode():
+        generation_output = model.generate(
+            input_ids=input_ids,
+            generation_config=generation_config,
+            return_dict_in_generate=True,
+            output_scores=True,
+            max_new_tokens=max_new_tokens,
+        )
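+    s = generation_output.sequences[0]
+    # Drop special tokens such as <s> and </s> so they do not leak into the answer.
+    output = tokenizer.decode(s, skip_special_tokens=True)
+    # Keep only the text generated after the response marker.
+    return output.split("### Réponse:")[1].strip()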
+# Example: instruction only (the comments after the call show the model output)
+instruct("Expliquer le théorème central limite.")
+# Le théorème central limite stipule que la loi de la moyenne des valeurs aléatoires d'une série de variables aléatoires est la loi normale.
+# Cela signifie que la moyenne des valeurs aléatoires d'une série de variables aléatoires tend vers la loi normale, indépendamment de la taille de la série.
+# Example: instruction with an additional input
+instruct(
+    "Traduisez le texte suivant en français.",
+    input="Caterpillars extract nutrients which are then converted into butterflies. People have extracted billions of nuggets of understanding and GPT-4 is humanity's butterfly.",
+)
+# Les papillons de nuit extraient des nutriments qui sont ensuite convertis en papillons. Les gens ont extrait des milliards de nuggets de compréhension et GPT-4 est la butterfly de l'humanité.
+```
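The prompt handling in the usage snippet above can be checked without downloading any weights. The following is a minimal sketch, not part of the original repo: the helper names `build_prompt` and `extract_response` and the constant `RESPONSE_MARKER` are hypothetical, and the templates are copied from `PROMPT_DICT` above. It shows how the two templates are selected and how the answer is cut out of the decoded text.

```python
# Templates copied from the usage snippet; everything else here is illustrative.
PROMPT_DICT = {
    "prompt_input": (
        "Ci-dessous se trouve une instruction qui décrit une tâche, associée à une entrée qui fournit un contexte supplémentaire. Écrivez une réponse qui complète correctement la demande.\n\n"
        "### Instruction:\n{instruction}\n\n### Entrée:\n{input}\n\n### Réponse:\n"
    ),
    "prompt_no_input": (
        "Ci-dessous se trouve une instruction qui décrit une tâche. Écrivez une réponse qui complète correctement la demande.\n\n"
        "### Instruction:\n{instruction}\n\n### Réponse:\n"
    ),
}

# Hypothetical constant: the marker that precedes the model's answer.
RESPONSE_MARKER = "### Réponse:"


def build_prompt(instruction, input=None):
    """Pick the template with or without the input field and fill it in."""
    if input is not None:
        return PROMPT_DICT["prompt_input"].format_map(
            {"instruction": instruction, "input": input}
        )
    return PROMPT_DICT["prompt_no_input"].format_map({"instruction": instruction})


def extract_response(decoded):
    """Return the text after the first response marker, stripped of whitespace.

    Unlike split(...)[1], partition returns an empty string instead of raising
    IndexError when the marker is missing, and keeps any later markers.
    """
    _, _, answer = decoded.partition(RESPONSE_MARKER)
    return answer.strip()


if __name__ == "__main__":
    prompt = build_prompt("Expliquer le théorème central limite.")
    print(prompt)
    # Simulate decoded model output: the prompt followed by generated text.
    print(extract_response(prompt + "Texte généré par le modèle. "))
```

Using `str.partition` here is a small design change from the `split("### Réponse:")[1]` call in the snippet above: it does not fail when the marker is absent, which may be preferable when the generated text is truncated or malformed.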
+## Todo
+- Add output examples
+- Open-source the GitHub repo