Commit eb767e0 · Update README.md
Parent(s): bea966a

README.md CHANGED
@@ -3,11 +3,15 @@ Quantized using qwopqwop200's GPTQ-for-LLaMa repo on the t5 branch.

 Quantization command:
 
-
+```
+PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python t5.py ../full-models/flan-ul2 wikitext2 --nsamples 256 --wbits 4 --act-order --groupsize 128 --save ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt
+```
 
 Benchmark command:
 
-
+```
+python t5.py ../full-models/flan-ul2 wikitext2 --load ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq2.pt --wbits 4 --groupsize 128 --benchmark --benchmark_mode mmlu
+```
 
 Results :
 ```
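As a quick sanity check after running the quantization command above, the sketch below loads the saved checkpoint and lists a few entries. It assumes the `--save` flag writes a plain `torch` state dict (the behavior of mainline GPTQ-for-LLaMa; not verified for the t5 branch), and `ckpt_path` simply mirrors the path used above:

```python
# Minimal sanity check: load the quantized checkpoint on CPU and list a
# few entries. Assumes --save used torch.save on a state dict (mainline
# GPTQ-for-LLaMa behavior; unverified for the t5 branch).
import torch

ckpt_path = "../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt"
state = torch.load(ckpt_path, map_location="cpu")

# GPTQ layers typically store packed integer weights plus per-group
# scales and zero points; the key names and dtypes confirm the format.
for name, tensor in list(state.items())[:8]:
    print(f"{name:60s} {tuple(tensor.shape)} {tensor.dtype}")
```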