Commit bace60a · Parent(s): 9478eab

include notes about quantization process

Readme.md CHANGED
@@ -53,6 +53,42 @@ Framework versions

## Setup Notes

### Download torch model

This example demonstrates using `hfdownloader` to download a torch model from Hugging Face to `./storage`:

```bash
./hfdownloader -m truehealth/LLama-2-MedText-13b
```

If necessary, install `hfdownloader` from https://github.com/bodaay/HuggingFaceModelDownloader:

```bash
bash <(curl -sSL https://raw.githubusercontent.com/bodaay/HuggingFaceModelDownloader/master/scripts/gist_gethfd.sh) -h
```
### Quantize torch model with llama.cpp

To quantize directly to q8_0:

```bash
llama.cpp/convert.py --outtype q8_0 --outfile LLama-2-MedText-13b-q8_0.gguf ./models/Storage/truehealth_LLama-2-MedText-13b/pytorch_model-00001-of-00003.bin
```

Alternatively, first convert to an f32 GGUF:

```bash
llama.cpp/convert.py --outtype f32 --outfile LLama-2-MedText-13b-f32.gguf ./models/Storage/truehealth_LLama-2-MedText-13b/pytorch_model-00001-of-00003.bin
```

Then quantize the f32 GGUF down to lower bit resolutions:

```bash
llama.cpp/build/bin/quantize LLama-2-MedText-13b-f32.gguf LLama-2-MedText-13b-Q3_K_L.gguf Q3_K_L
```
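The q8_0-style formats used above store weights as small integers plus a per-block scale factor. A minimal sketch of that idea in pure Python (illustrative only — llama.cpp's actual q8_0 block layout differs in its details):

```python
# Toy per-block 8-bit quantization: each block of weights becomes
# int8 values plus one float scale, roughly the idea behind q8_0.
# Illustrative only; llama.cpp's real block layout differs.

def quantize_block(block):
    # Symmetric quantization: map the largest magnitude to 127.
    scale = max(abs(x) for x in block) / 127 or 1.0
    q = [round(x / scale) for x in block]
    return scale, q

def dequantize_block(scale, q):
    return [scale * v for v in q]

weights = [0.12, -0.53, 0.07, 0.91, -0.33, 0.0, 0.48, -0.76]
scale, q = quantize_block(weights)
restored = dequantize_block(scale, q)

# Every quantized value fits in int8, and the rounding error
# per weight is bounded by the block's scale.
assert all(-127 <= v <= 127 for v in q)
assert max(abs(a - b) for a, b in zip(weights, restored)) < scale
```

Lower-bit formats such as Q3_K_L apply the same principle with fewer bits per weight and a more elaborate block structure, trading a little accuracy for a much smaller file.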
### Distributing the model through Hugging Face

```bash
mkvirtualenv -p `which python3.11` -a . ${PWD##*/}
python -m pip install huggingface_hub
```
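For a sense of why the lower-bit files are the ones worth distributing, here is a back-of-the-envelope size estimate for a 13B-parameter model. The formula (parameters × bits-per-weight / 8) ignores metadata overhead, and the bits-per-weight figures are approximations, since block formats also store scales:

```python
# Rough file-size estimate: parameters * bits-per-weight / 8.
# Bits-per-weight values are approximate (block scales add overhead).
PARAMS = 13_000_000_000

formats = {
    "f32": 32.0,     # full-precision GGUF
    "q8_0": 8.5,     # ~8 bits per weight plus per-block scales
    "Q3_K_L": 3.4,   # approximate effective bits for this k-quant
}

for name, bits in formats.items():
    gb = PARAMS * bits / 8 / 1e9
    print(f"{name}: ~{gb:.1f} GB")
```

An f32 GGUF of this model lands around 52 GB, while the 3-bit k-quant is roughly a tenth of that, which makes a large difference for hosting and downloads.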