prithivMLmods committed
Commit a7b16a0 · verified · 1 Parent(s): 0311e7b

Update README.md

Files changed (1)
  1. README.md +93 -0
README.md CHANGED
@@ -52,3 +52,96 @@ To download Original checkpoints, see the example command below leveraging `huggingface-cli`:
```
huggingface-cli download prithivMLmods/Llama-Thinker-3B-Preview --include "original/*" --local-dir Llama-Thinker-3B-Preview
```
---

# **How to Run Llama-Thinker-3B-Preview on Ollama Locally**

This guide demonstrates how to run the **Llama-Thinker-3B-Preview-GGUF** model locally using Ollama. The model is instruction-tuned for multilingual tasks and complex reasoning, making it highly versatile for a wide range of use cases. By the end, you'll be equipped to run this and other open-source models with ease.
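
If Ollama isn't installed yet, the official install script from ollama.com sets it up on Linux and macOS (Windows users can grab the installer from the same site). A minimal sketch:

```bash
# Install Ollama (Linux/macOS); see https://ollama.com/download for other options
curl -fsSL https://ollama.com/install.sh | sh

# Confirm the CLI is available
ollama --version
```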

---

## Example 1: How to Run the Llama-Thinker-3B-Preview Model

The **Llama-Thinker-3B** model is a pretrained and instruction-tuned LLM, designed for complex reasoning tasks across multiple languages. In this guide, we'll interact with it locally using Ollama, with support for quantized models.

### Step 1: Download the Model

First, download the **Llama-Thinker-3B-Preview-GGUF** model using the following command:

```bash
ollama run llama-thinker-3b-preview.gguf
```
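
Note that `ollama run` expects a model name that Ollama can resolve. If you have instead downloaded the raw `.gguf` file (for example with `huggingface-cli` as shown earlier), you can register it as a local model through a Modelfile. A minimal sketch, assuming the file is named `llama-thinker-3b-preview.gguf` and sits in the current directory:

```bash
# Point a Modelfile at the downloaded GGUF file
# (the filename is an assumption -- adjust it to match your download)
cat > Modelfile <<'EOF'
FROM ./llama-thinker-3b-preview.gguf
EOF

# Build a local Ollama model from the Modelfile, then start chatting with it
ollama create llama-thinker-3b-preview -f Modelfile
ollama run llama-thinker-3b-preview
```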

### Step 2: Model Initialization and Download

Once the command is executed, Ollama will initialize and download the necessary model files. You should see output similar to this:

```plaintext
pulling manifest
pulling a12cd3456efg... 100% ▕██████████████████████████████████████████████████████▏ 3.2 GB
pulling 9f87ghijklmn... 100% ▕██████████████████████████████████████████████████████▏ 6.5 KB
verifying sha256 digest
writing manifest
removing any unused layers
success
>>> Send a message (/? for help)
```
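
To confirm the model is available locally, you can list everything Ollama has registered; the exact names and sizes shown will depend on what you pulled or created:

```bash
# List locally available models with their sizes
ollama list
```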

### Step 3: Interact with the Model

Once the model is fully loaded, you can interact with it by sending prompts. For example, let's ask:

```plaintext
>>> How can you assist me today?
```

A sample response might look like this (responses vary and may not be identical):

```plaintext
I am Llama-Thinker-3B, an advanced AI language model designed to assist with complex reasoning, multilingual tasks, and general-purpose queries. Here are a few things I can help you with:

1. Answering complex questions in multiple languages.
2. Assisting with creative writing, content generation, and problem-solving.
3. Providing detailed summaries and explanations.
4. Translating text across different languages.
5. Generating ideas for personal or professional use.
6. Offering insights on technical topics.

Feel free to ask me anything you'd like assistance with!
```
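
The interactive prompt isn't the only way in: the same model can be driven non-interactively from the shell or through Ollama's local HTTP API, which listens on port 11434 by default. A short sketch (the model name is assumed to match the tag used earlier):

```bash
# One-shot prompt from the shell, no interactive session needed
ollama run llama-thinker-3b-preview "Summarize what you can help with in one sentence."

# The same request through the local REST API
curl http://localhost:11434/api/generate -d '{
  "model": "llama-thinker-3b-preview",
  "prompt": "How can you assist me today?",
  "stream": false
}'
```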

### Step 4: Exit the Program

To exit the program, type the `/bye` command (or press `Ctrl+D`):

```plaintext
/bye
```

---

## Example 2: Using Multi-Modal Models (Future Use)

In the future, Ollama may support multi-modal models where you can input both text and images for advanced interactions. This section will be updated as new capabilities become available.

---

## Notes on Using Quantized Models

Quantized models like **llama-thinker-3b-preview.gguf** are optimized for efficient performance on local systems with limited resources. Here are some key points to ensure smooth operation:

1. **VRAM/CPU Requirements**: Ensure your system has adequate VRAM or CPU resources to handle model inference (see the quick check after this list).
2. **Model Format**: Use the `.gguf` model format for compatibility with Ollama.

---

# **Conclusion**

Running the **Llama-Thinker-3B-Preview** model locally using Ollama provides a powerful way to leverage open-source LLMs for complex reasoning and multilingual tasks. By following this guide, you can explore other models and expand your use cases as new models become available.