Right now, the following quantizations are available:

* [stablelm-3b-4e1t-Q3_K_M](https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/blob/main/stablelm-3b-4e1t-Q3_K_M.bin)
* [stablelm-3b-4e1t-Q4_K_M](https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/blob/main/stablelm-3b-4e1t-Q4_K_M.bin)
* [stablelm-3b-4e1t-Q5_K_M](https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/blob/main/stablelm-3b-4e1t-Q5_K_M.bin)
* [stablelm-3b-4e1t-Q6_K](https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/blob/main/stablelm-3b-4e1t-Q6_K.bin)
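For scripted downloads, the `blob` pages linked above are HTML viewers; on the Hugging Face Hub, swapping `blob` for `resolve` in such a URL yields the raw file. A minimal sketch with a hypothetical helper (`download_url` is not part of this repository):

```python
# Hypothetical helper: build a direct-download URL for one of the
# quantizations listed above, following the Hub's blob -> resolve convention.
REPO = "https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF"

def download_url(quantization: str) -> str:
    """Return the raw-file URL for a quantization suffix such as 'Q4_K_M'."""
    return f"{REPO}/resolve/main/stablelm-3b-4e1t-{quantization}.bin"

print(download_url("Q4_K_M"))
# → https://huggingface.co/rozek/StableLM-3B-4E1T_GGUF/resolve/main/stablelm-3b-4e1t-Q4_K_M.bin
```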
Text embeddings are calculated with a command similar to

```
./embedding -m stablelm-3b-4e1t-Q8_0.bin --prompt "who was Joseph Weizenbaum?"
```
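The `embedding` binary prints one floating-point vector for the given prompt. Such vectors are usually compared by cosine similarity; a minimal pure-Python sketch (parsing llama.cpp's actual output format is left out, and the vectors below are toy stand-ins):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# toy vectors standing in for real embedding output
v1 = [0.1, 0.2, 0.3]
v2 = [0.1, 0.2, 0.3]
v3 = [-0.3, 0.2, -0.1]

print(round(cosine_similarity(v1, v2), 6))  # → 1.0 (identical direction)
print(cosine_similarity(v1, v3) < 1.0)      # → True (less similar)
```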
## Conversion Details ##

Conversion was done using a Docker container based on
`python:3.10.13-slim-bookworm`.

After downloading the original model files into a separate directory, the
container was started with

```
docker run --interactive \
  --mount type=bind,src=<local-folder>,dst=/llm \
  python:3.10.13-slim-bookworm
```

where `<local-folder>` was the path to the folder containing the downloaded
model.
In the container's terminal, the following commands were issued:

```
apt-get update
apt-get install build-essential git -y

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

## Important: uncomment the make command that fits your host computer!
## on Apple Silicon machines (see https://github.com/ggerganov/llama.cpp/issues/1655):
# UNAME_M=arm64 UNAME_p=arm LLAMA_NO_METAL=1 make
## otherwise:
# make

python3 -m pip install -r requirements.txt
pip install torch transformers

# see https://github.com/ggerganov/llama.cpp/issues/3344
python3 convert-hf-to-gguf.py /llm
mv /llm/ggml-model-f16.gguf /llm/stablelm-3b-4e1t.gguf

# the following command is just an example; modify it as needed
./quantize /llm/stablelm-3b-4e1t.gguf /llm/stablelm-3b-4e1t_Q3_K_M.gguf q3_k_m
```
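A quick sanity check on each produced file is its magic number: GGUF files begin with the four ASCII bytes `GGUF`. A minimal sketch (the `demo.gguf` file below is a stand-in; a real check would point at e.g. the `/llm/*.gguf` files produced above):

```python
def looks_like_gguf(path: str) -> bool:
    """Check the 4-byte GGUF magic at the start of the file."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# demo with a stand-in file instead of a real multi-gigabyte model
with open("demo.gguf", "wb") as f:
    f.write(b"GGUF" + b"\x03\x00\x00\x00")  # magic + little-endian version field

print(looks_like_gguf("demo.gguf"))  # → True
```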

After conversion, the mounted folder (the one that originally contained only the
model) also contains all the converted files.

The container itself may now be safely deleted - the conversions will remain on
disk.

## License ##