jphme committed
Commit 07a6432 · verified · 1 parent: 551f99a

Update README.md

Files changed (1): README.md (+43 -1)
README.md CHANGED
@@ -3,13 +3,15 @@ library_name: transformers
 tags: []
 ---
 
+*There is currently an issue with the **model generating random reserved special tokens (like "<|reserved_special_token_49|>") at the end**. Please decode with `skip_special_tokens=True`. We will update the model once we have found the reason for this behaviour. If you find a solution, please let us know!*
+
 # Llama 3 DiscoLM German 8b v0.1 Experimental
 
 <p align="center"><img src="disco_llama.webp" width="400"></p>
 
 # Introduction
 
-**Llama 3 DiscoLM German 8b v0.1 Experimental** is an experimental Llama 3 based version of DiscoLM German.
+**Llama 3 DiscoLM German 8b v0.1 Experimental** is an experimental Llama 3 based version of [DiscoLM German](https://huggingface.co/DiscoResearch/DiscoLM_German_7b_v1).
 
 This is an experimental release and not intended for production use. The model is still in development and will be updated with new features and improvements in the future.
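The workaround from the note above, as a minimal sketch: decoding with `skip_special_tokens=True` drops the stray reserved tokens (this assumes the reserved tokens are registered as special tokens in the tokenizer, which is the Llama 3 default).

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental"
)

# Encode a string that ends with one of the stray reserved tokens, then
# decode it twice to show the effect of skip_special_tokens.
ids = tokenizer.encode("Hallo Welt!<|reserved_special_token_49|>")
print(tokenizer.decode(ids, skip_special_tokens=False))  # token visible
print(tokenizer.decode(ids, skip_special_tokens=True))   # token stripped
```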
 
@@ -43,6 +45,46 @@ model.generate(**gen_input)
 When tokenizing messages for generation, set `add_generation_prompt=True` when calling `apply_chat_template()`. This will append `<|im_start|>assistant\n` to your prompt, to ensure
 that the model continues with an assistant response.
 
+# Example Code for Inference
+
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_id = "DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental"
+
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+
+messages = [
+    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
+    {"role": "user", "content": "Wer bist du?"},
+]
+
+# Apply the chat template and append the assistant header so the model
+# continues with an assistant response.
+input_ids = tokenizer.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    return_tensors="pt"
+).to(model.device)
+
+terminators = [
+    tokenizer.eos_token_id,
+    tokenizer.convert_tokens_to_ids("<|eot_id|>")
+]
+
+outputs = model.generate(
+    input_ids,
+    max_new_tokens=256,
+    eos_token_id=terminators,
+    do_sample=True,
+    temperature=0.6,
+    top_p=0.9,
+)
+# Decode only the newly generated tokens; skip_special_tokens=True also
+# strips the stray reserved special tokens mentioned above.
+response = outputs[0][input_ids.shape[-1]:]
+print(tokenizer.decode(response, skip_special_tokens=True))
+```
+
 
 # Limitations & Biases
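To see exactly what `add_generation_prompt=True` appends, a small sketch that renders the chat template as a string instead of token ids (the exact header text depends on the model's chat template):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental"
)

messages = [{"role": "user", "content": "Wer bist du?"}]

# tokenize=False returns the formatted prompt string, so the assistant
# header appended by add_generation_prompt=True becomes visible.
with_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
without_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=False
)
print(with_prompt[len(without_prompt):])  # the appended assistant header
```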