Commit eb767e0 · Update README.md
Parent(s): bea966a

README.md CHANGED
@@ -3,11 +3,15 @@ Quantized using qwopqwop200's GPTQ-for-LLaMa repo on the t5 branch.

 Quantization command:
 
-
+```
+PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python t5.py ../full-models/flan-ul2 wikitext2 --nsamples 256 --wbits 4 --act-order --groupsize 128 --save ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt
+```
 
 Benchmark command:
 
-
+```
+python t5.py ../full-models/flan-ul2 wikitext2 --load ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq2.pt --wbits 4 --groupsize 128 --benchmark --benchmark_mode mmlu
+```
 
 Results :
 ```
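As a quick sanity check after running the quantization command above, the sketch below loads the saved checkpoint and lists a few entries. It assumes the `--save` flag writes a plain `torch` state dict (the behavior of mainline GPTQ-for-LLaMa; not verified for the t5 branch), and `ckpt_path` simply mirrors the path used above:

```python
# Minimal sanity check: load the quantized checkpoint on CPU and list a
# few entries. Assumes --save used torch.save on a state dict (mainline
# GPTQ-for-LLaMa behavior; unverified for the t5 branch).
import torch

ckpt_path = "../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt"
state = torch.load(ckpt_path, map_location="cpu")

# GPTQ layers typically store packed integer weights plus per-group
# scales and zero points; the key names and dtypes confirm the format.
for name, tensor in list(state.items())[:8]:
    print(f"{name:60s} {tuple(tensor.shape)} {tensor.dtype}")
```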