Update README.md
---
library_name: transformers
tags: []
---

*There is currently an issue with the **model generating random reserved special tokens (like `<|reserved_special_token_49|>`) at the end**. Please decode with `skip_special_tokens=True` (see the inference example below). We will update the model once we have found the cause of this behaviour. If you have found a solution, please let us know!*

# Llama 3 DiscoLM German 8b v0.1 Experimental
<p align="center"><img src="disco_llama.webp" width="400"></p>
# Introduction
**Llama 3 DiscoLM German 8b v0.1 Experimental** is an experimental Llama 3 based version of [DiscoLM German](https://huggingface.co/DiscoResearch/DiscoLM_German_7b_v1).
This is an experimental release and not intended for production use. The model is still in development and will be updated with new features and improvements in the future.
When tokenizing messages for generation, set `add_generation_prompt=True` when calling `apply_chat_template()`. This will append `<|im_start|>assistant\n` to your prompt, to ensure that the model continues with an assistant response.
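
To see what this does to the prompt, the following minimal sketch renders the templated conversation as a plain string instead of token ids (the messages are only illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental")

messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Wer bist du?"},
]

# tokenize=False returns the rendered prompt string; with add_generation_prompt=True
# it should end with the assistant header described above.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```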
# Example Code for Inference

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Wer bist du?"},
]

# add_generation_prompt=True appends the assistant header so the model answers
# instead of continuing the user turn.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Decode only the newly generated tokens; skip_special_tokens=True also hides the
# stray reserved special tokens mentioned at the top of this card.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```
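
Regarding the stray reserved special tokens mentioned at the top of this card: the decode call above already passes `skip_special_tokens=True`. As an extra, purely hypothetical safeguard, such tokens can also be stripped from already-decoded text:

```python
import re

# Hypothetical helper (not part of the original card): remove reserved special token
# strings such as "<|reserved_special_token_49|>" from decoded text.
def strip_reserved_tokens(text: str) -> str:
    return re.sub(r"<\|reserved_special_token_\d+\|>", "", text)

print(strip_reserved_tokens("Ich bin ein hilfreicher Assistent.<|reserved_special_token_49|>"))
# -> Ich bin ein hilfreicher Assistent.
```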
# Limitations & Biases