Llama-Thinker-3B-Preview is a pretrained and instruction-tuned generative model designed for multilingual applications. It is trained on synthetic datasets based on long chains of thought, enabling it to perform complex reasoning tasks effectively.

Model Architecture: Llama-Thinker-3B-Preview is an autoregressive language model based on Llama 3.2 that uses an optimized transformer architecture. The tuned versions undergo supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

# **Use with transformers**

Starting with `transformers >= 4.43.0`, you can run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function.

Make sure to update your transformers installation via `pip install --upgrade transformers`.
```python
import torch
from transformers import pipeline

model_id = "prithivMLmods/Llama-Thinker-3B-Preview"
# Load the model in bfloat16 and spread it across available devices.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
# The last entry in generated_text is the assistant's reply.
print(outputs[0]["generated_text"][-1])
```
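
The same conversation can also be run through the Auto classes with `generate()` directly. Here is a minimal sketch of that flow, assuming the tokenizer ships a chat template (standard for Llama instruct checkpoints):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "prithivMLmods/Llama-Thinker-3B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
# Render the chat template into token ids and move them to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```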

Note: You can also find detailed recipes on how to use the model locally, with `torch.compile()`, assisted generation, quantization, and more at [`huggingface-llama-recipes`](https://github.com/huggingface/huggingface-llama-recipes).
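
As one illustration of the quantized path, a 4-bit load via `BitsAndBytesConfig` might look like the sketch below. This assumes `bitsandbytes` is installed and a CUDA GPU is available; it is not an official recipe from that repository:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "prithivMLmods/Llama-Thinker-3B-Preview"
# NF4 4-bit weights with bfloat16 compute (requires bitsandbytes).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```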
# **Use with `llama`**

Please follow the instructions in the [repository](https://github.com/meta-llama/llama).

To download the original checkpoints, see the example command below leveraging `huggingface-cli`:

```
huggingface-cli download prithivMLmods/Llama-Thinker-3B-Preview --include "original/*" --local-dir Llama-Thinker-3B-Preview
```
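
The same download can also be scripted from Python. A minimal sketch using `huggingface_hub.snapshot_download`, equivalent to the CLI call above (assumes `huggingface_hub` is installed):

```python
from huggingface_hub import snapshot_download

# Mirror of the CLI call: fetch only the original/* checkpoint files.
snapshot_download(
    repo_id="prithivMLmods/Llama-Thinker-3B-Preview",
    allow_patterns=["original/*"],
    local_dir="Llama-Thinker-3B-Preview",
)
```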