Commit bea966a · Update README.md
Parent(s): 3d290ac

README.md (CHANGED, @@ -1,3 +1,41 @@):
# flan-ul2 4-bit 128-groupsize GPTQ

Quantized using qwopqwop200's GPTQ-for-Llama repo on the t5 branch.
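For completeness, a minimal setup sketch before running the commands below. The repository URL and the CUDA-kernel build step are assumptions based on the description above (they may differ on the t5 branch), so treat this as a sketch rather than the exact steps used here:

```
# Assumed upstream repo; the t5 branch carries the seq2seq (T5/UL2) quantization code.
git clone -b t5 https://github.com/qwopqwop200/GPTQ-for-LLaMa.git
cd GPTQ-for-LLaMa
pip install -r requirements.txt
# Build the custom 4-bit CUDA kernels (present upstream; may not be required on every branch).
python setup_cuda.py install
```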
Quantization command:

`PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python t5.py ../full-models/flan-ul2 wikitext2 --nsamples 256 --wbits 4 --act-order --groupsize 128 --save ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt`
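Flag by flag, the command above breaks down roughly as follows. The annotations reflect common GPTQ-for-Llama conventions rather than official documentation, so take them as a sketch:

```
# PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512   caps allocator block size to reduce CUDA OOM from fragmentation
# ../full-models/flan-ul2    local path to the full-precision flan-ul2 checkpoint
# wikitext2                  calibration dataset used to collect activation statistics
# --nsamples 256             number of calibration samples
# --wbits 4                  quantize weights to 4 bits
# --act-order                quantize columns in order of decreasing activation magnitude (usually improves accuracy)
# --groupsize 128            separate quantization scale/zero-point for every 128 weight columns
# --save ...                 where to write the quantized checkpoint
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python t5.py ../full-models/flan-ul2 wikitext2 \
  --nsamples 256 --wbits 4 --act-order --groupsize 128 \
  --save ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt
```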
Benchmark command:

`python t5.py ../full-models/flan-ul2 wikitext2 --load ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq2.pt --wbits 4 --groupsize 128 --benchmark --benchmark_mode mmlu`
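Similarly, an annotated copy of the benchmark invocation (again a sketch; `--benchmark_mode mmlu` drives the MMLU evaluation whose per-category output appears below):

```
# --load ...                 path to the quantized checkpoint produced above
# --wbits 4 --groupsize 128  should match the settings used at quantization time
# --benchmark                run the evaluation pass instead of re-quantizing
# --benchmark_mode mmlu      evaluate on the MMLU suite and report per-category accuracy
python t5.py ../full-models/flan-ul2 wikitext2 \
  --load ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq2.pt \
  --wbits 4 --groupsize 128 --benchmark --benchmark_mode mmlu
```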
Results:

```
Average accuracy 0.289 - math
Average accuracy 0.562 - health
Average accuracy 0.416 - physics
Average accuracy 0.780 - business
Average accuracy 0.610 - biology
Average accuracy 0.446 - chemistry
Average accuracy 0.461 - computer science
Average accuracy 0.513 - economics
Average accuracy 0.538 - engineering
Average accuracy 0.455 - philosophy
Average accuracy 0.622 - other
Average accuracy 0.703 - history
Average accuracy 0.707 - geography
Average accuracy 0.718 - politics
Average accuracy 0.653 - psychology
Average accuracy 0.711 - culture
Average accuracy 0.447 - law
Average accuracy 0.416 - STEM
Average accuracy 0.501 - humanities
Average accuracy 0.643 - social sciences
Average accuracy 0.613 - other (business, health, misc.)
MMLU Average accuracy: 0.540
```
---
license: apache-2.0
---