fbaldassarri committed
Commit
507642a
1 Parent(s): ce7a6b2

Upload README.md

README.md CHANGED
@@ -10,6 +10,7 @@ tags:
  - intel-autoround
  - awq
  - autoawq
+ - auto-awq
  - woq
 license: apache-2.0
 model_name: Pythia 12b
@@ -24,8 +25,6 @@ prompt_template: '{prompt}
 quantized_by: fbaldassarri
 ---
 
-
-
 ## Model Information
 
 Quantized version of [EleutherAI/pythia-12b](https://huggingface.co/EleutherAI/pythia-12b) using torch.float32 for quantization tuning.
@@ -34,7 +33,7 @@ Quantized version of [EleutherAI/pythia-12b](https://huggingface.co/EleutherAI/p
 - Asymmetrical Quantization
 - Method AutoAWQ
 
-Quantization framework: [Intel AutoRound](https://github.com/intel/auto-round)
+Quantization framework: [Intel AutoRound](https://github.com/intel/auto-round) v0.4.2
 
 Note: this INT4 version of pythia-12b has been quantized to run inference through CPU.
 
@@ -48,13 +47,14 @@ I suggest to install requirements into a dedicated python-virtualenv or a conda 
 python -m pip install <package> --upgrade
 ```
 
-- accelerate==1.0.1
+- accelerate==1.2.0
+- autoawq==0.2.7.post3
 - auto_gptq==0.7.1
-- neural_compressor==3.1
-- torch==2.3.0+cpu
-- torchaudio==2.5.0+cpu
-- torchvision==0.18.0+cpu
-- transformers==4.45.2
+- neural_compressor==3.1.1
+- torch==2.4.1+cpu
+- torchaudio==2.4.1+cpu
+- torchvision==0.19.1+cpu
+- transformers==4.47.0
 
 ### Step 2 Build Intel Autoround wheel from sources
 
@@ -70,8 +70,8 @@ python -m pip install git+https://github.com/intel/auto-round.git
 model = AutoModelForCausalLM.from_pretrained(model_name)
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 from auto_round import AutoRound
-bits, group_size, sym = 4, 128, False
-autoround = AutoRound(model, tokenizer, nsamples=128, iters=200, seqlen=512, batch_size=4, bits=bits, group_size=group_size, sym=sym)
+bits, group_size, sym, device, amp = 4, 128, False, 'cpu', False
+autoround = AutoRound(model, tokenizer, nsamples=128, iters=200, seqlen=512, batch_size=4, bits=bits, group_size=group_size, sym=sym, device=device, amp=amp)
 autoround.quantize()
 output_dir = "./AutoRound/EleutherAI_pythia-12b-autoawq-int4-gs128-asym"
 autoround.save_quantized(output_dir, format='auto_awq', inplace=True)
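For intuition, the final hunk pins the AutoRound settings used here: `bits=4`, `group_size=128`, `sym=False` (asymmetric) on CPU. The toy, pure-Python sketch below illustrates what asymmetric group-wise INT4 quantization of a weight row means under those settings. It is an illustration only, not AutoRound's actual algorithm (AutoRound additionally tunes the rounding itself via gradient-based optimization); all function names here are hypothetical.

```python
import random

def quantize_group_asym(w, bits=4):
    """Asymmetric uniform quantization of one weight group.

    Maps floats in [min(w), max(w)] onto integers 0..2**bits - 1
    using a per-group scale and zero point (this is what sym=False means:
    the integer grid is shifted to cover the group's actual range).
    """
    qmax = (1 << bits) - 1                       # 15 for INT4
    lo, hi = min(w), max(w)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero = round(-lo / scale)                    # integer zero point
    q = [min(qmax, max(0, round(x / scale) + zero)) for x in w]
    return q, scale, zero

def dequantize(q, scale, zero):
    """Reconstruct approximate float weights from integers."""
    return [(v - zero) * scale for v in q]

# One 256-weight row, quantized in independent groups of 128 (group_size=128)
random.seed(0)
row = [random.gauss(0.0, 1.0) for _ in range(256)]
recon = []
for group in (row[:128], row[128:]):
    q, scale, zero = quantize_group_asym(group)
    recon.extend(dequantize(q, scale, zero))

max_err = max(abs(a - b) for a, b in zip(row, recon))
print(round(max_err, 4))  # bounded by roughly one quantization step per group
```

A smaller `group_size` gives each scale/zero pair fewer weights to cover, reducing error at the cost of more stored metadata; `sym=True` would instead force the zero point to the middle of the grid.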