Commit 44732ee (parent 3f8ce6e) by hierholzer: Update README.md

Models initially developed in frameworks like PyTorch can be converted to GGUF format.

Here are the quantized versions that I have available:

- [x] Q2_K
- [x] Q3_K_S
- [x] Q3_K_M
- [x] Q3_K_L
- [x] Q4_K_S
- [x] Q4_K_M ~ *Recommended*
- [x] Q5_K_S ~ *Recommended*

Feel free to reach out to me if you need a specific Quantization Type that I do not currently offer.
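If you would rather download one of these quantizations directly from this repository, a minimal sketch using the `huggingface-cli` tool is shown below; the `--include` glob assumes the quantization type appears in the filenames, so check the repository's file list and adjust it as needed.

```shell
# Sketch: fetch only the Q4_K_M files from this repository.
# The --include pattern is an assumption about the file naming.
pip install -U "huggingface_hub[cli]"
huggingface-cli download hierholzer/Llama-3.3-70B-Instruct-GGUF \
  --include "*Q4_K_M*" \
  --local-dir ./Llama-3.3-70B-Instruct-GGUF
```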

### 📈All Quantization Types Possible
Below is a table of all the Quantization Types that are possible, as well as short descriptions.

| **#** | **or** | **Q#** | **:** | _Description Of Quantization Types_ |
|-------|:------:|:------:|:-----:|----------------------------------------------------------------|

By using a GGUF version of Llama-3.3-70B-Instruct, you will be able to run this LLM while using significantly fewer resources than the non-quantized version would require.
This also allows you to run this 70B model on a machine with less memory than the non-quantized version would need.
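As a rough back-of-the-envelope estimate (not an exact measurement): 70 billion parameters at 16 bits per weight come to roughly 70 × 2 bytes ≈ 140 GB for the weights alone, while a ~4-5 bit quantization such as Q4_K_M lands somewhere around 40-45 GB.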

## ⚙️️Installation
--------------------------------------------
Here are 2 different methods you can use to run the quantized versions of Llama-3.3-70B-Instruct:

git clone https://github.com/oobabooga/text-generation-webui.git

Ollama runs as a local service.
Although it technically works using a command-line interface, Ollama's best attribute is its REST API.
Being able to utilize your locally run LLMs through this API can give you almost endless possibilities!
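As a minimal sketch of what using that API can look like, assuming Ollama's default port of 11434 and that you have already pulled one of this repository's quantizations (as described further below):

```shell
# Send a one-off generation request to the local Ollama service.
# "stream": false returns the whole response as a single JSON object.
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/hierholzer/Llama-3.3-70B-Instruct-GGUF:Q4_K_M",
  "prompt": "Explain GGUF quantization in one sentence.",
  "stream": false
}'
```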

*Feel free to reach out to me if you would like to know some examples that I use this API for*

#### ☑️ How to install Ollama
```shell
https://ollama.com/download
```
Using Windows or Mac, you will then download a file and run it.
If you are using Linux, it will just provide a single command that you need to run in your terminal window.
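For reference, the single Linux command published on that page at the time of writing is shown below; treat it as a sketch and use whatever the download page currently shows.

```shell
# One-line Linux install, as published on ollama.com at the time of writing.
curl -fsSL https://ollama.com/install.sh | sh
```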
*That's about it for installing Ollama*

#### ✅Using Llama-3.3-70B-Instruct-GGUF with Ollama
Ollama does have a Model Library where you can download models:
```shell
https://ollama.com/library
```
This Model Library offers many different LLM versions that you can use.
However, at the time of writing this, there is no version of Llama-3.3-Instruct offered in the Ollama library.

If you would like to use Llama-3.3-Instruct (70B), do the following:

| # | Running the 70B quantized version of Llama 3.3-Instruct with Ollama |
|----|----------------------------------------------------------------------------------------------|

ollama run hf.co/hierholzer/Llama-3.3-70B-Instruct-GGUF:Q4_K_M

*Replace Q4_K_M with whatever version you would like to use from this repository.*

| # | Running the 70B quantized version of Llama 3.3-Instruct with Ollama - *continued* |
|----|-----------------------------------------------------------------------------------|
| 3. | This will download & run the model. It will also be saved for future use. |
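As a small sketch of how to confirm the saved model: `ollama list` shows everything cached locally, and later `ollama run` calls reuse that cached copy instead of downloading again.

```shell
# List locally saved models; the hf.co/... tag should appear here
# after the first pull completes.
ollama list
```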

-------------------------------------------------