---
language: 
  - en
  - fr
  - ro
  - de
  - multilingual
license: apache-2.0
metrics:
  - mmlu
---
# flan-ul2 4-bit 128-groupsize GPTQ
Quantized with the `t5` branch of qwopqwop200's [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) repository.<br>
The original model can be found at [google/flan-ul2](https://huggingface.co/google/flan-ul2).

Quantization command:
```
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 python t5.py ../full-models/flan-ul2 wikitext2 --nsamples 256 --wbits 4 --act-order --groupsize 128 --save ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt
```
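
For context on what `--wbits 4` and `--groupsize 128` control, below is a minimal sketch of group-wise 4-bit round-to-nearest quantization. It only illustrates the group/scale layout; actual GPTQ additionally corrects quantization error with second-order (Hessian) information, and `--act-order` changes the order in which columns are quantized. All names here are illustrative, not taken from the repository.
```
import torch

def quantize_groupwise(w: torch.Tensor, wbits: int = 4, groupsize: int = 128):
    # Split each row of the weight matrix into groups of `groupsize`
    # input channels; each group gets its own scale and zero point.
    rows, cols = w.shape
    g = w.reshape(rows, cols // groupsize, groupsize)
    qmax = 2 ** wbits - 1
    wmin = g.amin(dim=-1, keepdim=True)
    wmax = g.amax(dim=-1, keepdim=True)
    scale = (wmax - wmin).clamp(min=1e-8) / qmax
    q = ((g - wmin) / scale).round().clamp(0, qmax)   # 4-bit integer codes
    dequant = (q * scale + wmin).reshape(rows, cols)  # reconstructed weights
    return q.to(torch.uint8), scale, wmin, dequant

w = torch.randn(1024, 1024)
_, _, _, w_hat = quantize_groupwise(w)
print(f"mean abs reconstruction error: {(w - w_hat).abs().mean():.4f}")
```
The `PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512` prefix caps the size of blocks the CUDA caching allocator will split, which helps avoid fragmentation-driven out-of-memory errors when quantizing a model this large.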
Benchmark command:
```
python t5.py ../full-models/flan-ul2 wikitext2 --load ../gptq-models/flan-ul2-gptq/flan-ul2-4bit-128g-gptq.pt --wbits 4 --groupsize 128 --benchmark --benchmark_mode mmlu
```
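
`--benchmark_mode mmlu` runs the MMLU multiple-choice benchmark. As a rough illustration of how such scoring works, the sketch below compares the model's likelihood of each answer letter. It uses the unquantized `google/flan-ul2` via `transformers` for simplicity (loading the GPTQ checkpoint requires the branch's own `t5.py` code), and the exact prompt format and few-shot setup of the benchmark script may differ. Note the full model needs roughly 40 GB of memory in bf16.
```
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/flan-ul2")
model = T5ForConditionalGeneration.from_pretrained(
    "google/flan-ul2", device_map="auto", torch_dtype=torch.bfloat16
)

prompt = (
    "The following is a multiple choice question about geography.\n\n"
    "Which is the largest ocean?\n"
    "A. Atlantic\nB. Indian\nC. Pacific\nD. Arctic\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Score each choice by the log-likelihood the model assigns to its letter.
scores = {}
for letter in "ABCD":
    labels = tokenizer(letter, return_tensors="pt").input_ids.to(model.device)
    scores[letter] = -model(**inputs, labels=labels).loss.item()

print(max(scores, key=scores.get))  # expected: C
```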
Results:
```
Average accuracy 0.289 - math
Average accuracy 0.562 - health
Average accuracy 0.416 - physics
Average accuracy 0.780 - business
Average accuracy 0.610 - biology
Average accuracy 0.446 - chemistry
Average accuracy 0.461 - computer science
Average accuracy 0.513 - economics
Average accuracy 0.538 - engineering
Average accuracy 0.455 - philosophy
Average accuracy 0.622 - other
Average accuracy 0.703 - history
Average accuracy 0.707 - geography
Average accuracy 0.718 - politics
Average accuracy 0.653 - psychology
Average accuracy 0.711 - culture
Average accuracy 0.447 - law
Average accuracy 0.416 - STEM
Average accuracy 0.501 - humanities
Average accuracy 0.643 - social sciences
Average accuracy 0.613 - other (business, health, misc.)
MMLU Average accuracy: 0.540
```
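
Note that the headline 0.540 is lower than the unweighted mean of the seventeen subject rows above (≈0.567, computed below), which suggests the benchmark weights each subject by its number of questions, as the standard MMLU evaluation does. A quick check using only the numbers reported above:
```
subjects = {
    "math": 0.289, "health": 0.562, "physics": 0.416, "business": 0.780,
    "biology": 0.610, "chemistry": 0.446, "computer science": 0.461,
    "economics": 0.513, "engineering": 0.538, "philosophy": 0.455,
    "other": 0.622, "history": 0.703, "geography": 0.707,
    "politics": 0.718, "psychology": 0.653, "culture": 0.711, "law": 0.447,
}
# Unweighted (macro) mean of the 17 subject accuracies.
print(f"macro average: {sum(subjects.values()) / len(subjects):.3f}")  # 0.567
```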