To download Original checkpoints, see the example command below leveraging `huggingface-cli`:
```
huggingface-cli download prithivMLmods/Llama-Thinker-3B-Preview --include "original/*" --local-dir Llama-Thinker-3B-Preview
```
---

# **How to Run Llama-Thinker-3B-Preview on Ollama Locally**

This guide demonstrates how to run the **Llama-Thinker-3B-Preview-GGUF** model locally using Ollama. The model is instruction-tuned for multilingual tasks and complex reasoning, making it highly versatile for a wide range of use cases. By the end, you'll be equipped to run this and other open-source models with ease.

---

## Example 1: How to Run the Llama-Thinker-3B-Preview Model

The **Llama-Thinker-3B** model is a pretrained and instruction-tuned LLM, designed for complex reasoning tasks across multiple languages. In this guide, we'll interact with it locally using Ollama, with support for quantized models.

### Step 1: Download the Model

First, download the **Llama-Thinker-3B-Preview-GGUF** model. Ollama can pull GGUF models directly from the Hugging Face Hub by repository path:

```bash
ollama run hf.co/prithivMLmods/Llama-Thinker-3B-Preview-GGUF
```
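
Alternatively, if you have already downloaded a `.gguf` file locally (for example with `huggingface-cli`), you can register it with Ollama through a minimal Modelfile. This is a sketch: the filename below is an assumption, so substitute the quantized file you actually downloaded.

```bash
# Hypothetical local filename -- replace with the quant you downloaded.
cat > Modelfile <<'EOF'
FROM ./Llama-Thinker-3B-Preview.Q5_K_M.gguf
EOF

# Register the GGUF under a local name, then run it.
ollama create llama-thinker-3b-preview -f Modelfile
ollama run llama-thinker-3b-preview
```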

### Step 2: Model Initialization and Download

Once the command is executed, Ollama will initialize and download the necessary model files. You should see output similar to this:

```plaintext
pulling manifest
pulling a12cd3456efg... 100% ████████████████████████████████████████ 3.2 GB
pulling 9f87ghijklmn... 100% ████████████████████████████████████████ 6.5 KB
verifying sha256 digest
writing manifest
removing any unused layers
success
>>> Send a message (/? for help)
```
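
To confirm the model is available locally, you can list installed models with the standard `ollama list` command (the name and size shown will depend on what you pulled):

```bash
# The newly pulled model should appear in this list.
ollama list
```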

### Step 3: Interact with the Model

Once the model is fully loaded, you can interact with it by sending prompts. For example, let's ask:

```plaintext
>>> How can you assist me today?
```

A sample response might look like this (the exact output may vary):

```plaintext
I am Llama-Thinker-3B, an advanced AI language model designed to assist with complex reasoning, multilingual tasks, and general-purpose queries. Here are a few things I can help you with:

1. Answering complex questions in multiple languages.
2. Assisting with creative writing, content generation, and problem-solving.
3. Providing detailed summaries and explanations.
4. Translating text across different languages.
5. Generating ideas for personal or professional use.
6. Offering insights on technical topics.

Feel free to ask me anything you'd like assistance with!
```
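
`ollama run` also accepts a prompt as a command-line argument, which is useful for scripting: it prints the completion to stdout and exits instead of opening the interactive session. The model name below assumes the pull from Step 1:

```bash
# One-shot, non-interactive prompt; the response is written to stdout.
ollama run hf.co/prithivMLmods/Llama-Thinker-3B-Preview-GGUF "Summarize what model quantization does in two sentences."
```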

### Step 4: Exit the Program

To exit the program, type the following at the prompt (or press `Ctrl+d`):

```plaintext
/bye
```

---

## Example 2: Using Multi-Modal Models (Future Use)

In the future, Ollama may support multi-modal models where you can input both text and images for advanced interactions. This section will be updated as new capabilities become available.

---

## Notes on Using Quantized Models

Quantized models like **llama-thinker-3b-preview.gguf** are optimized for efficient performance on local systems with limited resources. Here are some key points to ensure smooth operation, followed by a small scripting sketch:

1. **VRAM/CPU Requirements**: Ensure your system has adequate VRAM or CPU resources to handle model inference.
2. **Model Format**: Use the `.gguf` model format for compatibility with Ollama.
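
For scripted use, Ollama also serves a local REST API (port 11434 by default) on top of the same quantized model. Below is a minimal sketch using Ollama's standard `/api/generate` endpoint; the model name assumes the hf.co pull from Step 1:

```bash
# Request a single non-streaming completion from the local Ollama server.
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/prithivMLmods/Llama-Thinker-3B-Preview-GGUF",
  "prompt": "Explain in one paragraph why quantized models use less memory.",
  "stream": false
}'
```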

---

# **Conclusion**

Running the **Llama-Thinker-3B-Preview** model locally using Ollama provides a powerful way to leverage open-source LLMs for complex reasoning and multilingual tasks. By following this guide, you can explore other models and expand your use cases as new models become available.