legraphista committed
Commit 3048fc1 • 1 Parent(s): 113fd1c
Upload README.md with huggingface_hub

README.md CHANGED
@@ -22,6 +22,20 @@ Original dtype: `BF16` (`bfloat16`)
Quantized by: llama.cpp [https://github.com/ggerganov/llama.cpp/pull/7519](https://github.com/ggerganov/llama.cpp/pull/7519)
IMatrix dataset: [here](https://gist.githubusercontent.com/legraphista/d6d93f1a254bcfc58e0af3777eaec41e/raw/d380e7002cea4a51c33fffd47db851942754e7cc/imatrix.calibration.medium.raw)

+ - [DeepSeek-V2-Lite-IMat-GGUF](#deepseek-v2-lite-imat-gguf)
+   - [Files](#files)
+     - [IMatrix](#imatrix)
+     - [Common Quants](#common-quants)
+     - [All Quants](#all-quants)
+   - [Downloading using huggingface-cli](#downloading-using-huggingface-cli)
+   - [Inference](#inference)
+     - [Llama.cpp](#llama-cpp)
+   - [FAQ](#faq)
+     - [Why is the IMatrix not applied everywhere?](#why-is-the-imatrix-not-applied-everywhere)
+     - [How do I merge a split GGUF?](#how-do-i-merge-a-split-gguf)
+
+ ---
+
## Files

### IMatrix
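The IMatrix dataset linked in the hunk above is the calibration text that llama.cpp's `imatrix` tool consumes. As a rough sketch of how such a matrix is produced (the full-precision GGUF name here is hypothetical, not a file from this repo):

```
# Compute an importance matrix from the calibration text.
# The imatrix tool ships with llama.cpp; the BF16 source GGUF name is illustrative.
./llama.cpp/imatrix \
  -m DeepSeek-V2-Lite.BF16.gguf \
  -f imatrix.calibration.medium.raw \
  -o imatrix.dat
# imatrix.dat is then passed to the quantize tool via --imatrix during quantization.
```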
@@ -64,20 +78,31 @@ Link: [here](https://huggingface.co/legraphista/DeepSeek-V2-Lite-IMat-GGUF/blob/


## Downloading using huggingface-cli
-
+ If you do not have huggingface-cli installed:
```
pip install -U "huggingface_hub[cli]"
```
-
+ Download the specific file you want:
```
huggingface-cli download legraphista/DeepSeek-V2-Lite-IMat-GGUF --include "DeepSeek-V2-Lite.Q8_0.gguf" --local-dir ./
```
- If the model is
+ If the model file is big, it has been split into multiple files. In order to download them all to a local folder, run:
```
huggingface-cli download legraphista/DeepSeek-V2-Lite-IMat-GGUF --include "DeepSeek-V2-Lite.Q8_0/*" --local-dir DeepSeek-V2-Lite.Q8_0
# see FAQ for merging GGUF's
```

+ ---
+
+ ## Inference
+
+ ### Llama.cpp
+ ```
+ llama.cpp/main -m DeepSeek-V2-Lite.Q8_0.gguf --color -i -p "prompt here"
+ ```
+
+ ---
+
## FAQ

### Why is the IMatrix not applied everywhere?
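The `main` invocation added in the Inference section runs interactively with only a prompt. A slightly fuller sketch using standard llama.cpp flags (the values are arbitrary examples, not recommendations from this repo):

```
# Interactive session with an explicit context window, generation limit and temperature.
# -c context size, -n max tokens to generate, --temp sampling temperature
./llama.cpp/main -m DeepSeek-V2-Lite.Q8_0.gguf \
  -c 4096 -n 256 --temp 0.7 \
  --color -i -p "prompt here"
```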