aikitoria committed
Commit b6f4770 · verified · 1 Parent(s): 6e790da

Upload folder using huggingface_hub

config.json ADDED
@@ -0,0 +1,41 @@
+ {
+     "producer": {
+         "name": "modelopt",
+         "version": "0.19.0"
+     },
+     "architecture": "LlamaForCausalLM",
+     "dtype": "bfloat16",
+     "logits_dtype": "float32",
+     "num_hidden_layers": 88,
+     "num_attention_heads": 96,
+     "num_key_value_heads": 8,
+     "hidden_size": 12288,
+     "norm_epsilon": 1e-05,
+     "vocab_size": 32768,
+     "max_position_embeddings": 131072,
+     "hidden_act": "silu",
+     "use_parallel_embedding": true,
+     "embedding_sharding_dim": 0,
+     "quantization": {
+         "quant_algo": "MIXED_PRECISION",
+         "kv_cache_quant_algo": "FP8"
+     },
+     "mapping": {
+         "world_size": 4,
+         "tp_size": 4,
+         "pp_size": 1
+     },
+     "head_size": 128,
+     "intermediate_size": 28672,
+     "position_embedding_type": "rope_gpt_neox",
+     "share_embedding_table": false,
+     "residual_mlp": false,
+     "bias": false,
+     "rotary_pct": 1.0,
+     "rank": 3,
+     "decoder": "llama",
+     "rmsnorm": true,
+     "lm_head_bias": false,
+     "rotary_base": 1000000.0,
+     "model_type": "llama"
+ }
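The fields in this config.json are related to one another, and a quick check can confirm they are consistent. The following is a minimal Python sketch, not part of the commit: the function name `check_config` is hypothetical, the values are a subset copied from the hunk, and the KV-cache estimate assumes one byte per FP8 element and ignores any scaling factors.

```python
import json

# Subset of the values shown in the config.json hunk of this commit.
CONFIG_TEXT = """
{
    "num_hidden_layers": 88,
    "num_attention_heads": 96,
    "num_key_value_heads": 8,
    "hidden_size": 12288,
    "head_size": 128,
    "mapping": {"world_size": 4, "tp_size": 4, "pp_size": 1}
}
"""


def check_config(cfg: dict, kv_bytes_per_element: int = 1) -> dict:
    """Derive a few quantities from the config (hypothetical helper).

    kv_bytes_per_element=1 is an assumption for an FP8 KV cache.
    """
    mapping = cfg["mapping"]
    tp, pp = mapping["tp_size"], mapping["pp_size"]
    return {
        # hidden_size should split evenly into attention heads of head_size.
        "head_size_matches": cfg["hidden_size"] // cfg["num_attention_heads"]
        == cfg["head_size"],
        # world_size is expected to equal tp_size * pp_size.
        "world_size_matches": tp * pp == mapping["world_size"],
        # Grouped-query attention: query heads sharing each KV head.
        "query_heads_per_kv_head": cfg["num_attention_heads"]
        // cfg["num_key_value_heads"],
        # KV heads held by each tensor-parallel rank.
        "kv_heads_per_tp_rank": cfg["num_key_value_heads"] // tp,
        # K and V, for every layer and KV head, summed over all ranks.
        "kv_cache_bytes_per_token": 2
        * cfg["num_hidden_layers"]
        * cfg["num_key_value_heads"]
        * cfg["head_size"]
        * kv_bytes_per_element,
    }


if __name__ == "__main__":
    for key, value in check_config(json.loads(CONFIG_TEXT)).items():
        print(f"{key}: {value}")
```

Under these assumptions the sketch reports 12 query heads per KV head and 2 KV heads per tensor-parallel rank.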
quant_cfg.json ADDED
@@ -0,0 +1,2340 @@
1
+ {
2
+ "quant_algo": "MIXED_PRECISION",
3
+ "kv_cache_quant_algo": "FP8",
4
+ "quantized_layers": {
5
+ "transformer.layers.0.attention.qkv": {
6
+ "quant_algo": "W4A8_AWQ",
7
+ "group_size": 128,
8
+ "has_zero_point": false,
9
+ "pre_quant_scale": true
10
+ },
11
+ "transformer.layers.0.attention.dense": {
12
+ "quant_algo": "W4A8_AWQ",
13
+ "group_size": 128,
14
+ "has_zero_point": false,
15
+ "pre_quant_scale": true
16
+ },
17
+ "transformer.layers.0.mlp.fc": {
18
+ "quant_algo": "W4A8_AWQ",
19
+ "group_size": 128,
20
+ "has_zero_point": false,
21
+ "pre_quant_scale": true
22
+ },
23
+ "transformer.layers.0.mlp.gate": {
24
+ "quant_algo": "W4A8_AWQ",
25
+ "group_size": 128,
26
+ "has_zero_point": false,
27
+ "pre_quant_scale": true
28
+ },
29
+ "transformer.layers.0.mlp.proj": {
30
+ "quant_algo": "W4A16_AWQ",
31
+ "group_size": 128,
32
+ "has_zero_point": false,
33
+ "pre_quant_scale": true
34
+ },
35
+ "transformer.layers.1.attention.qkv": {
36
+ "quant_algo": "W4A16_AWQ",
37
+ "group_size": 128,
38
+ "has_zero_point": false,
39
+ "pre_quant_scale": true
40
+ },
41
+ "transformer.layers.1.attention.dense": {
42
+ "quant_algo": "W4A8_AWQ",
43
+ "group_size": 128,
44
+ "has_zero_point": false,
45
+ "pre_quant_scale": true
46
+ },
47
+ "transformer.layers.1.mlp.fc": {
48
+ "quant_algo": "W4A8_AWQ",
49
+ "group_size": 128,
50
+ "has_zero_point": false,
51
+ "pre_quant_scale": true
52
+ },
53
+ "transformer.layers.1.mlp.gate": {
54
+ "quant_algo": "W4A8_AWQ",
55
+ "group_size": 128,
56
+ "has_zero_point": false,
57
+ "pre_quant_scale": true
58
+ },
59
+ "transformer.layers.1.mlp.proj": {
60
+ "quant_algo": "W4A8_AWQ",
61
+ "group_size": 128,
62
+ "has_zero_point": false,
63
+ "pre_quant_scale": true
64
+ },
65
+ "transformer.layers.2.attention.qkv": {
66
+ "quant_algo": "W4A8_AWQ",
67
+ "group_size": 128,
68
+ "has_zero_point": false,
69
+ "pre_quant_scale": true
70
+ },
71
+ "transformer.layers.2.attention.dense": {
72
+ "quant_algo": "W4A16_AWQ",
73
+ "group_size": 128,
74
+ "has_zero_point": false,
75
+ "pre_quant_scale": true
76
+ },
77
+ "transformer.layers.2.mlp.fc": {
78
+ "quant_algo": "FP8"
79
+ },
80
+ "transformer.layers.2.mlp.gate": {
81
+ "quant_algo": "FP8"
82
+ },
83
+ "transformer.layers.2.mlp.proj": {
84
+ "quant_algo": "W4A8_AWQ",
85
+ "group_size": 128,
86
+ "has_zero_point": false,
87
+ "pre_quant_scale": true
88
+ },
89
+ "transformer.layers.3.attention.qkv": {
90
+ "quant_algo": "W4A8_AWQ",
91
+ "group_size": 128,
92
+ "has_zero_point": false,
93
+ "pre_quant_scale": true
94
+ },
95
+ "transformer.layers.3.attention.dense": {
96
+ "quant_algo": "FP8"
97
+ },
98
+ "transformer.layers.3.mlp.fc": {
99
+ "quant_algo": "W4A8_AWQ",
100
+ "group_size": 128,
101
+ "has_zero_point": false,
102
+ "pre_quant_scale": true
103
+ },
104
+ "transformer.layers.3.mlp.gate": {
105
+ "quant_algo": "W4A8_AWQ",
106
+ "group_size": 128,
107
+ "has_zero_point": false,
108
+ "pre_quant_scale": true
109
+ },
110
+ "transformer.layers.3.mlp.proj": {
111
+ "quant_algo": "W4A16_AWQ",
112
+ "group_size": 128,
113
+ "has_zero_point": false,
114
+ "pre_quant_scale": true
115
+ },
116
+ "transformer.layers.4.attention.qkv": {
117
+ "quant_algo": "W4A8_AWQ",
118
+ "group_size": 128,
119
+ "has_zero_point": false,
120
+ "pre_quant_scale": true
121
+ },
122
+ "transformer.layers.4.attention.dense": {
123
+ "quant_algo": "FP8"
124
+ },
125
+ "transformer.layers.4.mlp.fc": {
126
+ "quant_algo": "W4A8_AWQ",
127
+ "group_size": 128,
128
+ "has_zero_point": false,
129
+ "pre_quant_scale": true
130
+ },
131
+ "transformer.layers.4.mlp.gate": {
132
+ "quant_algo": "W4A8_AWQ",
133
+ "group_size": 128,
134
+ "has_zero_point": false,
135
+ "pre_quant_scale": true
136
+ },
137
+ "transformer.layers.4.mlp.proj": {
138
+ "quant_algo": "W4A16_AWQ",
139
+ "group_size": 128,
140
+ "has_zero_point": false,
141
+ "pre_quant_scale": true
142
+ },
143
+ "transformer.layers.5.attention.qkv": {
144
+ "quant_algo": "W4A8_AWQ",
145
+ "group_size": 128,
146
+ "has_zero_point": false,
147
+ "pre_quant_scale": true
148
+ },
149
+ "transformer.layers.5.attention.dense": {
150
+ "quant_algo": "FP8"
151
+ },
152
+ "transformer.layers.5.mlp.fc": {
153
+ "quant_algo": "W4A16_AWQ",
154
+ "group_size": 128,
155
+ "has_zero_point": false,
156
+ "pre_quant_scale": true
157
+ },
158
+ "transformer.layers.5.mlp.gate": {
159
+ "quant_algo": "W4A16_AWQ",
160
+ "group_size": 128,
161
+ "has_zero_point": false,
162
+ "pre_quant_scale": true
163
+ },
164
+ "transformer.layers.5.mlp.proj": {
165
+ "quant_algo": "W4A8_AWQ",
166
+ "group_size": 128,
167
+ "has_zero_point": false,
168
+ "pre_quant_scale": true
169
+ },
170
+ "transformer.layers.6.attention.qkv": {
171
+ "quant_algo": "W4A16_AWQ",
172
+ "group_size": 128,
173
+ "has_zero_point": false,
174
+ "pre_quant_scale": true
175
+ },
176
+ "transformer.layers.6.attention.dense": {
177
+ "quant_algo": "W4A8_AWQ",
178
+ "group_size": 128,
179
+ "has_zero_point": false,
180
+ "pre_quant_scale": true
181
+ },
182
+ "transformer.layers.6.mlp.fc": {
183
+ "quant_algo": "W4A8_AWQ",
184
+ "group_size": 128,
185
+ "has_zero_point": false,
186
+ "pre_quant_scale": true
187
+ },
188
+ "transformer.layers.6.mlp.gate": {
189
+ "quant_algo": "W4A8_AWQ",
190
+ "group_size": 128,
191
+ "has_zero_point": false,
192
+ "pre_quant_scale": true
193
+ },
194
+ "transformer.layers.6.mlp.proj": {
195
+ "quant_algo": "W4A8_AWQ",
196
+ "group_size": 128,
197
+ "has_zero_point": false,
198
+ "pre_quant_scale": true
199
+ },
200
+ "transformer.layers.7.attention.qkv": {
201
+ "quant_algo": "W4A8_AWQ",
202
+ "group_size": 128,
203
+ "has_zero_point": false,
204
+ "pre_quant_scale": true
205
+ },
206
+ "transformer.layers.7.attention.dense": {
207
+ "quant_algo": "W4A16_AWQ",
208
+ "group_size": 128,
209
+ "has_zero_point": false,
210
+ "pre_quant_scale": true
211
+ },
212
+ "transformer.layers.7.mlp.fc": {
213
+ "quant_algo": "W4A8_AWQ",
214
+ "group_size": 128,
215
+ "has_zero_point": false,
216
+ "pre_quant_scale": true
217
+ },
218
+ "transformer.layers.7.mlp.gate": {
219
+ "quant_algo": "W4A8_AWQ",
220
+ "group_size": 128,
221
+ "has_zero_point": false,
222
+ "pre_quant_scale": true
223
+ },
224
+ "transformer.layers.7.mlp.proj": {
225
+ "quant_algo": "W4A8_AWQ",
226
+ "group_size": 128,
227
+ "has_zero_point": false,
228
+ "pre_quant_scale": true
229
+ },
230
+ "transformer.layers.8.attention.qkv": {
231
+ "quant_algo": "W4A8_AWQ",
232
+ "group_size": 128,
233
+ "has_zero_point": false,
234
+ "pre_quant_scale": true
235
+ },
236
+ "transformer.layers.8.attention.dense": {
237
+ "quant_algo": "W4A16_AWQ",
238
+ "group_size": 128,
239
+ "has_zero_point": false,
240
+ "pre_quant_scale": true
241
+ },
242
+ "transformer.layers.8.mlp.fc": {
243
+ "quant_algo": "W4A16_AWQ",
244
+ "group_size": 128,
245
+ "has_zero_point": false,
246
+ "pre_quant_scale": true
247
+ },
248
+ "transformer.layers.8.mlp.gate": {
249
+ "quant_algo": "W4A16_AWQ",
250
+ "group_size": 128,
251
+ "has_zero_point": false,
252
+ "pre_quant_scale": true
253
+ },
254
+ "transformer.layers.8.mlp.proj": {
255
+ "quant_algo": "W4A8_AWQ",
256
+ "group_size": 128,
257
+ "has_zero_point": false,
258
+ "pre_quant_scale": true
259
+ },
260
+ "transformer.layers.9.attention.qkv": {
261
+ "quant_algo": "W4A16_AWQ",
262
+ "group_size": 128,
263
+ "has_zero_point": false,
264
+ "pre_quant_scale": true
265
+ },
266
+ "transformer.layers.9.attention.dense": {
267
+ "quant_algo": "W4A16_AWQ",
268
+ "group_size": 128,
269
+ "has_zero_point": false,
270
+ "pre_quant_scale": true
271
+ },
272
+ "transformer.layers.9.mlp.fc": {
273
+ "quant_algo": "W4A16_AWQ",
274
+ "group_size": 128,
275
+ "has_zero_point": false,
276
+ "pre_quant_scale": true
277
+ },
278
+ "transformer.layers.9.mlp.gate": {
279
+ "quant_algo": "W4A16_AWQ",
280
+ "group_size": 128,
281
+ "has_zero_point": false,
282
+ "pre_quant_scale": true
283
+ },
284
+ "transformer.layers.9.mlp.proj": {
285
+ "quant_algo": "W4A16_AWQ",
286
+ "group_size": 128,
287
+ "has_zero_point": false,
288
+ "pre_quant_scale": true
289
+ },
290
+ "transformer.layers.10.attention.qkv": {
291
+ "quant_algo": "W4A16_AWQ",
292
+ "group_size": 128,
293
+ "has_zero_point": false,
294
+ "pre_quant_scale": true
295
+ },
296
+ "transformer.layers.10.attention.dense": {
297
+ "quant_algo": "W4A16_AWQ",
298
+ "group_size": 128,
299
+ "has_zero_point": false,
300
+ "pre_quant_scale": true
301
+ },
302
+ "transformer.layers.10.mlp.fc": {
303
+ "quant_algo": "W4A8_AWQ",
304
+ "group_size": 128,
305
+ "has_zero_point": false,
306
+ "pre_quant_scale": true
307
+ },
308
+ "transformer.layers.10.mlp.gate": {
309
+ "quant_algo": "W4A8_AWQ",
310
+ "group_size": 128,
311
+ "has_zero_point": false,
312
+ "pre_quant_scale": true
313
+ },
314
+ "transformer.layers.10.mlp.proj": {
315
+ "quant_algo": "W4A16_AWQ",
316
+ "group_size": 128,
317
+ "has_zero_point": false,
318
+ "pre_quant_scale": true
319
+ },
320
+ "transformer.layers.11.attention.qkv": {
321
+ "quant_algo": "W4A16_AWQ",
322
+ "group_size": 128,
323
+ "has_zero_point": false,
324
+ "pre_quant_scale": true
325
+ },
326
+ "transformer.layers.11.attention.dense": {
327
+ "quant_algo": "W4A8_AWQ",
328
+ "group_size": 128,
329
+ "has_zero_point": false,
330
+ "pre_quant_scale": true
331
+ },
332
+ "transformer.layers.11.mlp.fc": {
333
+ "quant_algo": "W4A16_AWQ",
334
+ "group_size": 128,
335
+ "has_zero_point": false,
336
+ "pre_quant_scale": true
337
+ },
338
+ "transformer.layers.11.mlp.gate": {
339
+ "quant_algo": "W4A16_AWQ",
340
+ "group_size": 128,
341
+ "has_zero_point": false,
342
+ "pre_quant_scale": true
343
+ },
344
+ "transformer.layers.11.mlp.proj": {
345
+ "quant_algo": "W4A16_AWQ",
346
+ "group_size": 128,
347
+ "has_zero_point": false,
348
+ "pre_quant_scale": true
349
+ },
350
+ "transformer.layers.12.attention.qkv": {
351
+ "quant_algo": "W4A16_AWQ",
352
+ "group_size": 128,
353
+ "has_zero_point": false,
354
+ "pre_quant_scale": true
355
+ },
356
+ "transformer.layers.12.attention.dense": {
357
+ "quant_algo": "W4A8_AWQ",
358
+ "group_size": 128,
359
+ "has_zero_point": false,
360
+ "pre_quant_scale": true
361
+ },
362
+ "transformer.layers.12.mlp.fc": {
363
+ "quant_algo": "W4A16_AWQ",
364
+ "group_size": 128,
365
+ "has_zero_point": false,
366
+ "pre_quant_scale": true
367
+ },
368
+ "transformer.layers.12.mlp.gate": {
369
+ "quant_algo": "W4A16_AWQ",
370
+ "group_size": 128,
371
+ "has_zero_point": false,
372
+ "pre_quant_scale": true
373
+ },
374
+ "transformer.layers.12.mlp.proj": {
375
+ "quant_algo": "W4A8_AWQ",
376
+ "group_size": 128,
377
+ "has_zero_point": false,
378
+ "pre_quant_scale": true
379
+ },
380
+ "transformer.layers.13.attention.qkv": {
381
+ "quant_algo": "W4A8_AWQ",
382
+ "group_size": 128,
383
+ "has_zero_point": false,
384
+ "pre_quant_scale": true
385
+ },
386
+ "transformer.layers.13.attention.dense": {
387
+ "quant_algo": "W4A8_AWQ",
388
+ "group_size": 128,
389
+ "has_zero_point": false,
390
+ "pre_quant_scale": true
391
+ },
392
+ "transformer.layers.13.mlp.fc": {
393
+ "quant_algo": "W4A16_AWQ",
394
+ "group_size": 128,
395
+ "has_zero_point": false,
396
+ "pre_quant_scale": true
397
+ },
398
+ "transformer.layers.13.mlp.gate": {
399
+ "quant_algo": "W4A16_AWQ",
400
+ "group_size": 128,
401
+ "has_zero_point": false,
402
+ "pre_quant_scale": true
403
+ },
404
+ "transformer.layers.13.mlp.proj": {
405
+ "quant_algo": "W4A16_AWQ",
406
+ "group_size": 128,
407
+ "has_zero_point": false,
408
+ "pre_quant_scale": true
409
+ },
410
+ "transformer.layers.14.attention.qkv": {
411
+ "quant_algo": "W4A8_AWQ",
412
+ "group_size": 128,
413
+ "has_zero_point": false,
414
+ "pre_quant_scale": true
415
+ },
416
+ "transformer.layers.14.attention.dense": {
417
+ "quant_algo": "W4A8_AWQ",
418
+ "group_size": 128,
419
+ "has_zero_point": false,
420
+ "pre_quant_scale": true
421
+ },
422
+ "transformer.layers.14.mlp.fc": {
423
+ "quant_algo": "W4A8_AWQ",
424
+ "group_size": 128,
425
+ "has_zero_point": false,
426
+ "pre_quant_scale": true
427
+ },
428
+ "transformer.layers.14.mlp.gate": {
429
+ "quant_algo": "W4A8_AWQ",
430
+ "group_size": 128,
431
+ "has_zero_point": false,
432
+ "pre_quant_scale": true
433
+ },
434
+ "transformer.layers.14.mlp.proj": {
435
+ "quant_algo": "W4A8_AWQ",
436
+ "group_size": 128,
437
+ "has_zero_point": false,
438
+ "pre_quant_scale": true
439
+ },
440
+ "transformer.layers.15.attention.qkv": {
441
+ "quant_algo": "W4A16_AWQ",
442
+ "group_size": 128,
443
+ "has_zero_point": false,
444
+ "pre_quant_scale": true
445
+ },
446
+ "transformer.layers.15.attention.dense": {
447
+ "quant_algo": "W4A8_AWQ",
448
+ "group_size": 128,
449
+ "has_zero_point": false,
450
+ "pre_quant_scale": true
451
+ },
452
+ "transformer.layers.15.mlp.fc": {
453
+ "quant_algo": "FP8"
454
+ },
455
+ "transformer.layers.15.mlp.gate": {
456
+ "quant_algo": "FP8"
457
+ },
458
+ "transformer.layers.15.mlp.proj": {
459
+ "quant_algo": "W4A8_AWQ",
460
+ "group_size": 128,
461
+ "has_zero_point": false,
462
+ "pre_quant_scale": true
463
+ },
464
+ "transformer.layers.16.attention.qkv": {
465
+ "quant_algo": "W4A16_AWQ",
466
+ "group_size": 128,
467
+ "has_zero_point": false,
468
+ "pre_quant_scale": true
469
+ },
470
+ "transformer.layers.16.attention.dense": {
471
+ "quant_algo": "W4A8_AWQ",
472
+ "group_size": 128,
473
+ "has_zero_point": false,
474
+ "pre_quant_scale": true
475
+ },
476
+ "transformer.layers.16.mlp.fc": {
477
+ "quant_algo": "W4A8_AWQ",
478
+ "group_size": 128,
479
+ "has_zero_point": false,
480
+ "pre_quant_scale": true
481
+ },
482
+ "transformer.layers.16.mlp.gate": {
483
+ "quant_algo": "W4A8_AWQ",
484
+ "group_size": 128,
485
+ "has_zero_point": false,
486
+ "pre_quant_scale": true
487
+ },
488
+ "transformer.layers.16.mlp.proj": {
489
+ "quant_algo": "W4A16_AWQ",
490
+ "group_size": 128,
491
+ "has_zero_point": false,
492
+ "pre_quant_scale": true
493
+ },
494
+ "transformer.layers.17.attention.qkv": {
495
+ "quant_algo": "W4A8_AWQ",
496
+ "group_size": 128,
497
+ "has_zero_point": false,
498
+ "pre_quant_scale": true
499
+ },
500
+ "transformer.layers.17.attention.dense": {
501
+ "quant_algo": "W4A8_AWQ",
502
+ "group_size": 128,
503
+ "has_zero_point": false,
504
+ "pre_quant_scale": true
505
+ },
506
+ "transformer.layers.17.mlp.fc": {
507
+ "quant_algo": "W4A8_AWQ",
508
+ "group_size": 128,
509
+ "has_zero_point": false,
510
+ "pre_quant_scale": true
511
+ },
512
+ "transformer.layers.17.mlp.gate": {
513
+ "quant_algo": "W4A8_AWQ",
514
+ "group_size": 128,
515
+ "has_zero_point": false,
516
+ "pre_quant_scale": true
517
+ },
518
+ "transformer.layers.17.mlp.proj": {
519
+ "quant_algo": "W4A16_AWQ",
520
+ "group_size": 128,
521
+ "has_zero_point": false,
522
+ "pre_quant_scale": true
523
+ },
524
+ "transformer.layers.18.attention.qkv": {
525
+ "quant_algo": "W4A8_AWQ",
526
+ "group_size": 128,
527
+ "has_zero_point": false,
528
+ "pre_quant_scale": true
529
+ },
530
+ "transformer.layers.18.attention.dense": {
531
+ "quant_algo": "W4A8_AWQ",
532
+ "group_size": 128,
533
+ "has_zero_point": false,
534
+ "pre_quant_scale": true
535
+ },
536
+ "transformer.layers.18.mlp.fc": {
537
+ "quant_algo": "W4A8_AWQ",
538
+ "group_size": 128,
539
+ "has_zero_point": false,
540
+ "pre_quant_scale": true
541
+ },
542
+ "transformer.layers.18.mlp.gate": {
543
+ "quant_algo": "W4A8_AWQ",
544
+ "group_size": 128,
545
+ "has_zero_point": false,
546
+ "pre_quant_scale": true
547
+ },
548
+ "transformer.layers.18.mlp.proj": {
549
+ "quant_algo": "W4A8_AWQ",
550
+ "group_size": 128,
551
+ "has_zero_point": false,
552
+ "pre_quant_scale": true
553
+ },
554
+ "transformer.layers.19.attention.qkv": {
555
+ "quant_algo": "W4A8_AWQ",
556
+ "group_size": 128,
557
+ "has_zero_point": false,
558
+ "pre_quant_scale": true
559
+ },
560
+ "transformer.layers.19.attention.dense": {
561
+ "quant_algo": "W4A16_AWQ",
562
+ "group_size": 128,
563
+ "has_zero_point": false,
564
+ "pre_quant_scale": true
565
+ },
566
+ "transformer.layers.19.mlp.fc": {
567
+ "quant_algo": "W4A8_AWQ",
568
+ "group_size": 128,
569
+ "has_zero_point": false,
570
+ "pre_quant_scale": true
571
+ },
572
+ "transformer.layers.19.mlp.gate": {
573
+ "quant_algo": "W4A8_AWQ",
574
+ "group_size": 128,
575
+ "has_zero_point": false,
576
+ "pre_quant_scale": true
577
+ },
578
+ "transformer.layers.19.mlp.proj": {
579
+ "quant_algo": "W4A16_AWQ",
580
+ "group_size": 128,
581
+ "has_zero_point": false,
582
+ "pre_quant_scale": true
583
+ },
584
+ "transformer.layers.20.attention.qkv": {
585
+ "quant_algo": "W4A8_AWQ",
586
+ "group_size": 128,
587
+ "has_zero_point": false,
588
+ "pre_quant_scale": true
589
+ },
590
+ "transformer.layers.20.attention.dense": {
591
+ "quant_algo": "W4A16_AWQ",
592
+ "group_size": 128,
593
+ "has_zero_point": false,
594
+ "pre_quant_scale": true
595
+ },
596
+ "transformer.layers.20.mlp.fc": {
597
+ "quant_algo": "W4A8_AWQ",
598
+ "group_size": 128,
599
+ "has_zero_point": false,
600
+ "pre_quant_scale": true
601
+ },
602
+ "transformer.layers.20.mlp.gate": {
603
+ "quant_algo": "W4A8_AWQ",
604
+ "group_size": 128,
605
+ "has_zero_point": false,
606
+ "pre_quant_scale": true
607
+ },
608
+ "transformer.layers.20.mlp.proj": {
609
+ "quant_algo": "W4A16_AWQ",
610
+ "group_size": 128,
611
+ "has_zero_point": false,
612
+ "pre_quant_scale": true
613
+ },
614
+ "transformer.layers.21.attention.qkv": {
615
+ "quant_algo": "W4A16_AWQ",
616
+ "group_size": 128,
617
+ "has_zero_point": false,
618
+ "pre_quant_scale": true
619
+ },
620
+ "transformer.layers.21.attention.dense": {
621
+ "quant_algo": "W4A8_AWQ",
622
+ "group_size": 128,
623
+ "has_zero_point": false,
624
+ "pre_quant_scale": true
625
+ },
626
+ "transformer.layers.21.mlp.fc": {
627
+ "quant_algo": "W4A8_AWQ",
628
+ "group_size": 128,
629
+ "has_zero_point": false,
630
+ "pre_quant_scale": true
631
+ },
632
+ "transformer.layers.21.mlp.gate": {
633
+ "quant_algo": "W4A8_AWQ",
634
+ "group_size": 128,
635
+ "has_zero_point": false,
636
+ "pre_quant_scale": true
637
+ },
638
+ "transformer.layers.21.mlp.proj": {
639
+ "quant_algo": "W4A16_AWQ",
640
+ "group_size": 128,
641
+ "has_zero_point": false,
642
+ "pre_quant_scale": true
643
+ },
644
+ "transformer.layers.22.attention.qkv": {
645
+ "quant_algo": "W4A16_AWQ",
646
+ "group_size": 128,
647
+ "has_zero_point": false,
648
+ "pre_quant_scale": true
649
+ },
650
+ "transformer.layers.22.attention.dense": {
651
+ "quant_algo": "W4A16_AWQ",
652
+ "group_size": 128,
653
+ "has_zero_point": false,
654
+ "pre_quant_scale": true
655
+ },
656
+ "transformer.layers.22.mlp.fc": {
657
+ "quant_algo": "W4A8_AWQ",
658
+ "group_size": 128,
659
+ "has_zero_point": false,
660
+ "pre_quant_scale": true
661
+ },
662
+ "transformer.layers.22.mlp.gate": {
663
+ "quant_algo": "W4A8_AWQ",
664
+ "group_size": 128,
665
+ "has_zero_point": false,
666
+ "pre_quant_scale": true
667
+ },
668
+ "transformer.layers.22.mlp.proj": {
669
+ "quant_algo": "W4A8_AWQ",
670
+ "group_size": 128,
671
+ "has_zero_point": false,
672
+ "pre_quant_scale": true
673
+ },
674
+ "transformer.layers.23.attention.qkv": {
675
+ "quant_algo": "W4A16_AWQ",
676
+ "group_size": 128,
677
+ "has_zero_point": false,
678
+ "pre_quant_scale": true
679
+ },
680
+ "transformer.layers.23.attention.dense": {
681
+ "quant_algo": "W4A8_AWQ",
682
+ "group_size": 128,
683
+ "has_zero_point": false,
684
+ "pre_quant_scale": true
685
+ },
686
+ "transformer.layers.23.mlp.fc": {
687
+ "quant_algo": "W4A8_AWQ",
688
+ "group_size": 128,
689
+ "has_zero_point": false,
690
+ "pre_quant_scale": true
691
+ },
692
+ "transformer.layers.23.mlp.gate": {
693
+ "quant_algo": "W4A8_AWQ",
694
+ "group_size": 128,
695
+ "has_zero_point": false,
696
+ "pre_quant_scale": true
697
+ },
698
+ "transformer.layers.23.mlp.proj": {
699
+ "quant_algo": "W4A16_AWQ",
700
+ "group_size": 128,
701
+ "has_zero_point": false,
702
+ "pre_quant_scale": true
703
+ },
704
+ "transformer.layers.24.attention.qkv": {
705
+ "quant_algo": "W4A16_AWQ",
706
+ "group_size": 128,
707
+ "has_zero_point": false,
708
+ "pre_quant_scale": true
709
+ },
710
+ "transformer.layers.24.attention.dense": {
711
+ "quant_algo": "W4A16_AWQ",
712
+ "group_size": 128,
713
+ "has_zero_point": false,
714
+ "pre_quant_scale": true
715
+ },
716
+ "transformer.layers.24.mlp.fc": {
717
+ "quant_algo": "W4A8_AWQ",
718
+ "group_size": 128,
719
+ "has_zero_point": false,
720
+ "pre_quant_scale": true
721
+ },
722
+ "transformer.layers.24.mlp.gate": {
723
+ "quant_algo": "W4A8_AWQ",
724
+ "group_size": 128,
725
+ "has_zero_point": false,
726
+ "pre_quant_scale": true
727
+ },
728
+ "transformer.layers.24.mlp.proj": {
729
+ "quant_algo": "W4A16_AWQ",
730
+ "group_size": 128,
731
+ "has_zero_point": false,
732
+ "pre_quant_scale": true
733
+ },
734
+ "transformer.layers.25.attention.qkv": {
735
+ "quant_algo": "W4A8_AWQ",
736
+ "group_size": 128,
737
+ "has_zero_point": false,
738
+ "pre_quant_scale": true
739
+ },
740
+ "transformer.layers.25.attention.dense": {
741
+ "quant_algo": "W4A8_AWQ",
742
+ "group_size": 128,
743
+ "has_zero_point": false,
744
+ "pre_quant_scale": true
745
+ },
746
+ "transformer.layers.25.mlp.fc": {
747
+ "quant_algo": "W4A16_AWQ",
748
+ "group_size": 128,
749
+ "has_zero_point": false,
750
+ "pre_quant_scale": true
751
+ },
752
+ "transformer.layers.25.mlp.gate": {
753
+ "quant_algo": "W4A16_AWQ",
754
+ "group_size": 128,
755
+ "has_zero_point": false,
756
+ "pre_quant_scale": true
757
+ },
758
+ "transformer.layers.25.mlp.proj": {
759
+ "quant_algo": "W4A8_AWQ",
760
+ "group_size": 128,
761
+ "has_zero_point": false,
762
+ "pre_quant_scale": true
763
+ },
764
+ "transformer.layers.26.attention.qkv": {
765
+ "quant_algo": "W4A8_AWQ",
766
+ "group_size": 128,
767
+ "has_zero_point": false,
768
+ "pre_quant_scale": true
769
+ },
770
+ "transformer.layers.26.attention.dense": {
771
+ "quant_algo": "W4A8_AWQ",
772
+ "group_size": 128,
773
+ "has_zero_point": false,
774
+ "pre_quant_scale": true
775
+ },
776
+ "transformer.layers.26.mlp.fc": {
777
+ "quant_algo": "W4A16_AWQ",
778
+ "group_size": 128,
779
+ "has_zero_point": false,
780
+ "pre_quant_scale": true
781
+ },
782
+ "transformer.layers.26.mlp.gate": {
783
+ "quant_algo": "W4A16_AWQ",
784
+ "group_size": 128,
785
+ "has_zero_point": false,
786
+ "pre_quant_scale": true
787
+ },
788
+ "transformer.layers.26.mlp.proj": {
789
+ "quant_algo": "W4A8_AWQ",
790
+ "group_size": 128,
791
+ "has_zero_point": false,
792
+ "pre_quant_scale": true
793
+ },
794
+ "transformer.layers.27.attention.qkv": {
795
+ "quant_algo": "W4A8_AWQ",
796
+ "group_size": 128,
797
+ "has_zero_point": false,
798
+ "pre_quant_scale": true
799
+ },
800
+ "transformer.layers.27.attention.dense": {
801
+ "quant_algo": "W4A8_AWQ",
802
+ "group_size": 128,
803
+ "has_zero_point": false,
804
+ "pre_quant_scale": true
805
+ },
806
+ "transformer.layers.27.mlp.fc": {
807
+ "quant_algo": "W4A8_AWQ",
808
+ "group_size": 128,
809
+ "has_zero_point": false,
810
+ "pre_quant_scale": true
811
+ },
812
+ "transformer.layers.27.mlp.gate": {
813
+ "quant_algo": "W4A8_AWQ",
814
+ "group_size": 128,
815
+ "has_zero_point": false,
816
+ "pre_quant_scale": true
817
+ },
818
+ "transformer.layers.27.mlp.proj": {
819
+ "quant_algo": "W4A16_AWQ",
820
+ "group_size": 128,
821
+ "has_zero_point": false,
822
+ "pre_quant_scale": true
823
+ },
824
+ "transformer.layers.28.attention.qkv": {
825
+ "quant_algo": "W4A8_AWQ",
826
+ "group_size": 128,
827
+ "has_zero_point": false,
828
+ "pre_quant_scale": true
829
+ },
830
+ "transformer.layers.28.attention.dense": {
831
+ "quant_algo": "W4A8_AWQ",
832
+ "group_size": 128,
833
+ "has_zero_point": false,
834
+ "pre_quant_scale": true
835
+ },
836
+ "transformer.layers.28.mlp.fc": {
837
+ "quant_algo": "W4A16_AWQ",
838
+ "group_size": 128,
839
+ "has_zero_point": false,
840
+ "pre_quant_scale": true
841
+ },
842
+ "transformer.layers.28.mlp.gate": {
843
+ "quant_algo": "W4A16_AWQ",
844
+ "group_size": 128,
845
+ "has_zero_point": false,
846
+ "pre_quant_scale": true
847
+ },
848
+ "transformer.layers.28.mlp.proj": {
849
+ "quant_algo": "W4A16_AWQ",
850
+ "group_size": 128,
851
+ "has_zero_point": false,
852
+ "pre_quant_scale": true
853
+ },
+         "transformer.layers.29.attention.qkv": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.29.attention.dense": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.29.mlp.fc": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.29.mlp.gate": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.29.mlp.proj": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.30.attention.qkv": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.30.attention.dense": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.30.mlp.fc": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.30.mlp.gate": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.30.mlp.proj": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.31.attention.qkv": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.31.attention.dense": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.31.mlp.fc": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.31.mlp.gate": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.31.mlp.proj": {
+             "quant_algo": "W4A16_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.32.attention.qkv": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.32.attention.dense": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.32.mlp.fc": {
+             "quant_algo": "W4A16_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.32.mlp.gate": {
+             "quant_algo": "W4A16_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.32.mlp.proj": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.33.attention.qkv": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.33.attention.dense": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.33.mlp.fc": {
+             "quant_algo": "W4A16_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.33.mlp.gate": {
+             "quant_algo": "W4A16_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.33.mlp.proj": {
+             "quant_algo": "W4A16_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.34.attention.qkv": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.34.attention.dense": {
+             "quant_algo": "FP8"
+         },
+         "transformer.layers.34.mlp.fc": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
+         },
+         "transformer.layers.34.mlp.gate": {
+             "quant_algo": "W4A8_AWQ",
+             "group_size": 128,
+             "has_zero_point": false,
+             "pre_quant_scale": true
991
+ },
992
+ "transformer.layers.34.mlp.proj": {
993
+ "quant_algo": "W4A8_AWQ",
994
+ "group_size": 128,
995
+ "has_zero_point": false,
996
+ "pre_quant_scale": true
997
+ },
998
+ "transformer.layers.35.attention.qkv": {
999
+ "quant_algo": "FP8"
1000
+ },
1001
+ "transformer.layers.35.attention.dense": {
1002
+ "quant_algo": "FP8"
1003
+ },
1004
+ "transformer.layers.35.mlp.fc": {
1005
+ "quant_algo": "W4A8_AWQ",
1006
+ "group_size": 128,
1007
+ "has_zero_point": false,
1008
+ "pre_quant_scale": true
1009
+ },
1010
+ "transformer.layers.35.mlp.gate": {
1011
+ "quant_algo": "W4A8_AWQ",
1012
+ "group_size": 128,
1013
+ "has_zero_point": false,
1014
+ "pre_quant_scale": true
1015
+ },
1016
+ "transformer.layers.35.mlp.proj": {
1017
+ "quant_algo": "W4A8_AWQ",
1018
+ "group_size": 128,
1019
+ "has_zero_point": false,
1020
+ "pre_quant_scale": true
1021
+ },
1022
+ "transformer.layers.36.attention.qkv": {
1023
+ "quant_algo": "FP8"
1024
+ },
1025
+ "transformer.layers.36.attention.dense": {
1026
+ "quant_algo": "FP8"
1027
+ },
1028
+ "transformer.layers.36.mlp.fc": {
1029
+ "quant_algo": "W4A8_AWQ",
1030
+ "group_size": 128,
1031
+ "has_zero_point": false,
1032
+ "pre_quant_scale": true
1033
+ },
1034
+ "transformer.layers.36.mlp.gate": {
1035
+ "quant_algo": "W4A8_AWQ",
1036
+ "group_size": 128,
1037
+ "has_zero_point": false,
1038
+ "pre_quant_scale": true
1039
+ },
1040
+ "transformer.layers.36.mlp.proj": {
1041
+ "quant_algo": "W4A16_AWQ",
1042
+ "group_size": 128,
1043
+ "has_zero_point": false,
1044
+ "pre_quant_scale": true
1045
+ },
1046
+ "transformer.layers.37.attention.qkv": {
1047
+ "quant_algo": "FP8"
1048
+ },
1049
+ "transformer.layers.37.attention.dense": {
1050
+ "quant_algo": "FP8"
1051
+ },
1052
+ "transformer.layers.37.mlp.fc": {
1053
+ "quant_algo": "W4A16_AWQ",
1054
+ "group_size": 128,
1055
+ "has_zero_point": false,
1056
+ "pre_quant_scale": true
1057
+ },
1058
+ "transformer.layers.37.mlp.gate": {
1059
+ "quant_algo": "W4A16_AWQ",
1060
+ "group_size": 128,
1061
+ "has_zero_point": false,
1062
+ "pre_quant_scale": true
1063
+ },
1064
+ "transformer.layers.37.mlp.proj": {
1065
+ "quant_algo": "FP8"
1066
+ },
1067
+ "transformer.layers.38.attention.qkv": {
1068
+ "quant_algo": "FP8"
1069
+ },
1070
+ "transformer.layers.38.attention.dense": {
1071
+ "quant_algo": "FP8"
1072
+ },
1073
+ "transformer.layers.38.mlp.fc": {
1074
+ "quant_algo": "W4A8_AWQ",
1075
+ "group_size": 128,
1076
+ "has_zero_point": false,
1077
+ "pre_quant_scale": true
1078
+ },
1079
+ "transformer.layers.38.mlp.gate": {
1080
+ "quant_algo": "W4A8_AWQ",
1081
+ "group_size": 128,
1082
+ "has_zero_point": false,
1083
+ "pre_quant_scale": true
1084
+ },
1085
+ "transformer.layers.38.mlp.proj": {
1086
+ "quant_algo": "FP8"
1087
+ },
1088
+ "transformer.layers.39.attention.qkv": {
1089
+ "quant_algo": "FP8"
1090
+ },
1091
+ "transformer.layers.39.attention.dense": {
1092
+ "quant_algo": "FP8"
1093
+ },
1094
+ "transformer.layers.39.mlp.fc": {
1095
+ "quant_algo": "W4A8_AWQ",
1096
+ "group_size": 128,
1097
+ "has_zero_point": false,
1098
+ "pre_quant_scale": true
1099
+ },
1100
+ "transformer.layers.39.mlp.gate": {
1101
+ "quant_algo": "W4A8_AWQ",
1102
+ "group_size": 128,
1103
+ "has_zero_point": false,
1104
+ "pre_quant_scale": true
1105
+ },
1106
+ "transformer.layers.39.mlp.proj": {
1107
+ "quant_algo": "FP8"
1108
+ },
1109
+ "transformer.layers.40.attention.qkv": {
1110
+ "quant_algo": "W4A16_AWQ",
1111
+ "group_size": 128,
1112
+ "has_zero_point": false,
1113
+ "pre_quant_scale": true
1114
+ },
1115
+ "transformer.layers.40.attention.dense": {
1116
+ "quant_algo": "FP8"
1117
+ },
1118
+ "transformer.layers.40.mlp.fc": {
1119
+ "quant_algo": "W4A8_AWQ",
1120
+ "group_size": 128,
1121
+ "has_zero_point": false,
1122
+ "pre_quant_scale": true
1123
+ },
1124
+ "transformer.layers.40.mlp.gate": {
1125
+ "quant_algo": "W4A8_AWQ",
1126
+ "group_size": 128,
1127
+ "has_zero_point": false,
1128
+ "pre_quant_scale": true
1129
+ },
1130
+ "transformer.layers.40.mlp.proj": {
1131
+ "quant_algo": "FP8"
1132
+ },
1133
+ "transformer.layers.41.attention.qkv": {
1134
+ "quant_algo": "W4A8_AWQ",
1135
+ "group_size": 128,
1136
+ "has_zero_point": false,
1137
+ "pre_quant_scale": true
1138
+ },
1139
+ "transformer.layers.41.attention.dense": {
1140
+ "quant_algo": "FP8"
1141
+ },
1142
+ "transformer.layers.41.mlp.fc": {
1143
+ "quant_algo": "W4A8_AWQ",
1144
+ "group_size": 128,
1145
+ "has_zero_point": false,
1146
+ "pre_quant_scale": true
1147
+ },
1148
+ "transformer.layers.41.mlp.gate": {
1149
+ "quant_algo": "W4A8_AWQ",
1150
+ "group_size": 128,
1151
+ "has_zero_point": false,
1152
+ "pre_quant_scale": true
1153
+ },
1154
+ "transformer.layers.41.mlp.proj": {
1155
+ "quant_algo": "FP8"
1156
+ },
1157
+ "transformer.layers.42.attention.qkv": {
1158
+ "quant_algo": "W4A16_AWQ",
1159
+ "group_size": 128,
1160
+ "has_zero_point": false,
1161
+ "pre_quant_scale": true
1162
+ },
1163
+ "transformer.layers.42.attention.dense": {
1164
+ "quant_algo": "FP8"
1165
+ },
1166
+ "transformer.layers.42.mlp.fc": {
1167
+ "quant_algo": "W4A8_AWQ",
1168
+ "group_size": 128,
1169
+ "has_zero_point": false,
1170
+ "pre_quant_scale": true
1171
+ },
1172
+ "transformer.layers.42.mlp.gate": {
1173
+ "quant_algo": "W4A8_AWQ",
1174
+ "group_size": 128,
1175
+ "has_zero_point": false,
1176
+ "pre_quant_scale": true
1177
+ },
1178
+ "transformer.layers.42.mlp.proj": {
1179
+ "quant_algo": "FP8"
1180
+ },
1181
+ "transformer.layers.43.attention.qkv": {
1182
+ "quant_algo": "W4A8_AWQ",
1183
+ "group_size": 128,
1184
+ "has_zero_point": false,
1185
+ "pre_quant_scale": true
1186
+ },
1187
+ "transformer.layers.43.attention.dense": {
1188
+ "quant_algo": "FP8"
1189
+ },
1190
+ "transformer.layers.43.mlp.fc": {
1191
+ "quant_algo": "W4A8_AWQ",
1192
+ "group_size": 128,
1193
+ "has_zero_point": false,
1194
+ "pre_quant_scale": true
1195
+ },
1196
+ "transformer.layers.43.mlp.gate": {
1197
+ "quant_algo": "W4A8_AWQ",
1198
+ "group_size": 128,
1199
+ "has_zero_point": false,
1200
+ "pre_quant_scale": true
1201
+ },
1202
+ "transformer.layers.43.mlp.proj": {
1203
+ "quant_algo": "FP8"
1204
+ },
1205
+ "transformer.layers.44.attention.qkv": {
1206
+ "quant_algo": "W4A16_AWQ",
1207
+ "group_size": 128,
1208
+ "has_zero_point": false,
1209
+ "pre_quant_scale": true
1210
+ },
1211
+ "transformer.layers.44.attention.dense": {
1212
+ "quant_algo": "FP8"
1213
+ },
1214
+ "transformer.layers.44.mlp.fc": {
1215
+ "quant_algo": "FP8"
1216
+ },
1217
+ "transformer.layers.44.mlp.gate": {
1218
+ "quant_algo": "FP8"
1219
+ },
1220
+ "transformer.layers.44.mlp.proj": {
1221
+ "quant_algo": "FP8"
1222
+ },
1223
+ "transformer.layers.45.attention.qkv": {
1224
+ "quant_algo": "W4A16_AWQ",
1225
+ "group_size": 128,
1226
+ "has_zero_point": false,
1227
+ "pre_quant_scale": true
1228
+ },
1229
+ "transformer.layers.45.attention.dense": {
1230
+ "quant_algo": "FP8"
1231
+ },
1232
+ "transformer.layers.45.mlp.fc": {
1233
+ "quant_algo": "FP8"
1234
+ },
1235
+ "transformer.layers.45.mlp.gate": {
1236
+ "quant_algo": "FP8"
1237
+ },
1238
+ "transformer.layers.45.mlp.proj": {
1239
+ "quant_algo": "FP8"
1240
+ },
1241
+ "transformer.layers.46.attention.qkv": {
1242
+ "quant_algo": "W4A16_AWQ",
1243
+ "group_size": 128,
1244
+ "has_zero_point": false,
1245
+ "pre_quant_scale": true
1246
+ },
1247
+ "transformer.layers.46.attention.dense": {
1248
+ "quant_algo": "W4A16_AWQ",
1249
+ "group_size": 128,
1250
+ "has_zero_point": false,
1251
+ "pre_quant_scale": true
1252
+ },
1253
+ "transformer.layers.46.mlp.fc": {
1254
+ "quant_algo": "FP8"
1255
+ },
1256
+ "transformer.layers.46.mlp.gate": {
1257
+ "quant_algo": "FP8"
1258
+ },
1259
+ "transformer.layers.46.mlp.proj": {
1260
+ "quant_algo": "FP8"
1261
+ },
1262
+ "transformer.layers.47.attention.qkv": {
1263
+ "quant_algo": "W4A16_AWQ",
1264
+ "group_size": 128,
1265
+ "has_zero_point": false,
1266
+ "pre_quant_scale": true
1267
+ },
1268
+ "transformer.layers.47.attention.dense": {
1269
+ "quant_algo": "FP8"
1270
+ },
1271
+ "transformer.layers.47.mlp.fc": {
1272
+ "quant_algo": "FP8"
1273
+ },
1274
+ "transformer.layers.47.mlp.gate": {
1275
+ "quant_algo": "FP8"
1276
+ },
1277
+ "transformer.layers.47.mlp.proj": {
1278
+ "quant_algo": "FP8"
1279
+ },
1280
+ "transformer.layers.48.attention.qkv": {
1281
+ "quant_algo": "W4A8_AWQ",
1282
+ "group_size": 128,
1283
+ "has_zero_point": false,
1284
+ "pre_quant_scale": true
1285
+ },
1286
+ "transformer.layers.48.attention.dense": {
1287
+ "quant_algo": "FP8"
1288
+ },
1289
+ "transformer.layers.48.mlp.fc": {
1290
+ "quant_algo": "FP8"
1291
+ },
1292
+ "transformer.layers.48.mlp.gate": {
1293
+ "quant_algo": "FP8"
1294
+ },
1295
+ "transformer.layers.48.mlp.proj": {
1296
+ "quant_algo": "FP8"
1297
+ },
1298
+ "transformer.layers.49.attention.qkv": {
1299
+ "quant_algo": "W4A16_AWQ",
1300
+ "group_size": 128,
1301
+ "has_zero_point": false,
1302
+ "pre_quant_scale": true
1303
+ },
1304
+ "transformer.layers.49.attention.dense": {
1305
+ "quant_algo": "FP8"
1306
+ },
1307
+ "transformer.layers.49.mlp.fc": {
1308
+ "quant_algo": "FP8"
1309
+ },
1310
+ "transformer.layers.49.mlp.gate": {
1311
+ "quant_algo": "FP8"
1312
+ },
1313
+ "transformer.layers.49.mlp.proj": {
1314
+ "quant_algo": "FP8"
1315
+ },
1316
+ "transformer.layers.50.attention.qkv": {
1317
+ "quant_algo": "W4A16_AWQ",
1318
+ "group_size": 128,
1319
+ "has_zero_point": false,
1320
+ "pre_quant_scale": true
1321
+ },
1322
+ "transformer.layers.50.attention.dense": {
1323
+ "quant_algo": "W4A16_AWQ",
1324
+ "group_size": 128,
1325
+ "has_zero_point": false,
1326
+ "pre_quant_scale": true
1327
+ },
1328
+ "transformer.layers.50.mlp.fc": {
1329
+ "quant_algo": "FP8"
1330
+ },
1331
+ "transformer.layers.50.mlp.gate": {
1332
+ "quant_algo": "FP8"
1333
+ },
1334
+ "transformer.layers.50.mlp.proj": {
1335
+ "quant_algo": "FP8"
1336
+ },
1337
+ "transformer.layers.51.attention.qkv": {
1338
+ "quant_algo": "W4A8_AWQ",
1339
+ "group_size": 128,
1340
+ "has_zero_point": false,
1341
+ "pre_quant_scale": true
1342
+ },
1343
+ "transformer.layers.51.attention.dense": {
1344
+ "quant_algo": "W4A16_AWQ",
1345
+ "group_size": 128,
1346
+ "has_zero_point": false,
1347
+ "pre_quant_scale": true
1348
+ },
1349
+ "transformer.layers.51.mlp.fc": {
1350
+ "quant_algo": "FP8"
1351
+ },
1352
+ "transformer.layers.51.mlp.gate": {
1353
+ "quant_algo": "FP8"
1354
+ },
1355
+ "transformer.layers.51.mlp.proj": {
1356
+ "quant_algo": "FP8"
1357
+ },
1358
+ "transformer.layers.52.attention.qkv": {
1359
+ "quant_algo": "W4A16_AWQ",
1360
+ "group_size": 128,
1361
+ "has_zero_point": false,
1362
+ "pre_quant_scale": true
1363
+ },
1364
+ "transformer.layers.52.attention.dense": {
1365
+ "quant_algo": "W4A16_AWQ",
1366
+ "group_size": 128,
1367
+ "has_zero_point": false,
1368
+ "pre_quant_scale": true
1369
+ },
1370
+ "transformer.layers.52.mlp.fc": {
1371
+ "quant_algo": "FP8"
1372
+ },
1373
+ "transformer.layers.52.mlp.gate": {
1374
+ "quant_algo": "FP8"
1375
+ },
1376
+ "transformer.layers.52.mlp.proj": {
1377
+ "quant_algo": "FP8"
1378
+ },
1379
+ "transformer.layers.53.attention.qkv": {
1380
+ "quant_algo": "W4A16_AWQ",
1381
+ "group_size": 128,
1382
+ "has_zero_point": false,
1383
+ "pre_quant_scale": true
1384
+ },
1385
+ "transformer.layers.53.attention.dense": {
1386
+ "quant_algo": "W4A16_AWQ",
1387
+ "group_size": 128,
1388
+ "has_zero_point": false,
1389
+ "pre_quant_scale": true
1390
+ },
1391
+ "transformer.layers.53.mlp.fc": {
1392
+ "quant_algo": "FP8"
1393
+ },
1394
+ "transformer.layers.53.mlp.gate": {
1395
+ "quant_algo": "FP8"
1396
+ },
1397
+ "transformer.layers.53.mlp.proj": {
1398
+ "quant_algo": "FP8"
1399
+ },
1400
+ "transformer.layers.54.attention.qkv": {
1401
+ "quant_algo": "W4A8_AWQ",
1402
+ "group_size": 128,
1403
+ "has_zero_point": false,
1404
+ "pre_quant_scale": true
1405
+ },
1406
+ "transformer.layers.54.attention.dense": {
1407
+ "quant_algo": "W4A16_AWQ",
1408
+ "group_size": 128,
1409
+ "has_zero_point": false,
1410
+ "pre_quant_scale": true
1411
+ },
1412
+ "transformer.layers.54.mlp.fc": {
1413
+ "quant_algo": "W4A16_AWQ",
1414
+ "group_size": 128,
1415
+ "has_zero_point": false,
1416
+ "pre_quant_scale": true
1417
+ },
1418
+ "transformer.layers.54.mlp.gate": {
1419
+ "quant_algo": "W4A16_AWQ",
1420
+ "group_size": 128,
1421
+ "has_zero_point": false,
1422
+ "pre_quant_scale": true
1423
+ },
1424
+ "transformer.layers.54.mlp.proj": {
1425
+ "quant_algo": "FP8"
1426
+ },
1427
+ "transformer.layers.55.attention.qkv": {
1428
+ "quant_algo": "W4A16_AWQ",
1429
+ "group_size": 128,
1430
+ "has_zero_point": false,
1431
+ "pre_quant_scale": true
1432
+ },
1433
+ "transformer.layers.55.attention.dense": {
1434
+ "quant_algo": "W4A16_AWQ",
1435
+ "group_size": 128,
1436
+ "has_zero_point": false,
1437
+ "pre_quant_scale": true
1438
+ },
1439
+ "transformer.layers.55.mlp.fc": {
1440
+ "quant_algo": "W4A8_AWQ",
1441
+ "group_size": 128,
1442
+ "has_zero_point": false,
1443
+ "pre_quant_scale": true
1444
+ },
1445
+ "transformer.layers.55.mlp.gate": {
1446
+ "quant_algo": "W4A8_AWQ",
1447
+ "group_size": 128,
1448
+ "has_zero_point": false,
1449
+ "pre_quant_scale": true
1450
+ },
1451
+ "transformer.layers.55.mlp.proj": {
1452
+ "quant_algo": "FP8"
1453
+ },
1454
+ "transformer.layers.56.attention.qkv": {
1455
+ "quant_algo": "W4A8_AWQ",
1456
+ "group_size": 128,
1457
+ "has_zero_point": false,
1458
+ "pre_quant_scale": true
1459
+ },
1460
+ "transformer.layers.56.attention.dense": {
1461
+ "quant_algo": "W4A16_AWQ",
1462
+ "group_size": 128,
1463
+ "has_zero_point": false,
1464
+ "pre_quant_scale": true
1465
+ },
1466
+ "transformer.layers.56.mlp.fc": {
1467
+ "quant_algo": "W4A16_AWQ",
1468
+ "group_size": 128,
1469
+ "has_zero_point": false,
1470
+ "pre_quant_scale": true
1471
+ },
1472
+ "transformer.layers.56.mlp.gate": {
1473
+ "quant_algo": "W4A16_AWQ",
1474
+ "group_size": 128,
1475
+ "has_zero_point": false,
1476
+ "pre_quant_scale": true
1477
+ },
1478
+ "transformer.layers.56.mlp.proj": {
1479
+ "quant_algo": "FP8"
1480
+ },
1481
+ "transformer.layers.57.attention.qkv": {
1482
+ "quant_algo": "W4A16_AWQ",
1483
+ "group_size": 128,
1484
+ "has_zero_point": false,
1485
+ "pre_quant_scale": true
1486
+ },
1487
+ "transformer.layers.57.attention.dense": {
1488
+ "quant_algo": "W4A8_AWQ",
1489
+ "group_size": 128,
1490
+ "has_zero_point": false,
1491
+ "pre_quant_scale": true
1492
+ },
1493
+ "transformer.layers.57.mlp.fc": {
1494
+ "quant_algo": "W4A16_AWQ",
1495
+ "group_size": 128,
1496
+ "has_zero_point": false,
1497
+ "pre_quant_scale": true
1498
+ },
1499
+ "transformer.layers.57.mlp.gate": {
1500
+ "quant_algo": "W4A16_AWQ",
1501
+ "group_size": 128,
1502
+ "has_zero_point": false,
1503
+ "pre_quant_scale": true
1504
+ },
1505
+ "transformer.layers.57.mlp.proj": {
1506
+ "quant_algo": "FP8"
1507
+ },
1508
+ "transformer.layers.58.attention.qkv": {
1509
+ "quant_algo": "W4A8_AWQ",
1510
+ "group_size": 128,
1511
+ "has_zero_point": false,
1512
+ "pre_quant_scale": true
1513
+ },
1514
+ "transformer.layers.58.attention.dense": {
1515
+ "quant_algo": "W4A8_AWQ",
1516
+ "group_size": 128,
1517
+ "has_zero_point": false,
1518
+ "pre_quant_scale": true
1519
+ },
1520
+ "transformer.layers.58.mlp.fc": {
1521
+ "quant_algo": "W4A8_AWQ",
1522
+ "group_size": 128,
1523
+ "has_zero_point": false,
1524
+ "pre_quant_scale": true
1525
+ },
1526
+ "transformer.layers.58.mlp.gate": {
1527
+ "quant_algo": "W4A8_AWQ",
1528
+ "group_size": 128,
1529
+ "has_zero_point": false,
1530
+ "pre_quant_scale": true
1531
+ },
1532
+ "transformer.layers.58.mlp.proj": {
1533
+ "quant_algo": "W4A16_AWQ",
1534
+ "group_size": 128,
1535
+ "has_zero_point": false,
1536
+ "pre_quant_scale": true
1537
+ },
1538
+ "transformer.layers.59.attention.qkv": {
1539
+ "quant_algo": "W4A16_AWQ",
1540
+ "group_size": 128,
1541
+ "has_zero_point": false,
1542
+ "pre_quant_scale": true
1543
+ },
1544
+ "transformer.layers.59.attention.dense": {
1545
+ "quant_algo": "W4A16_AWQ",
1546
+ "group_size": 128,
1547
+ "has_zero_point": false,
1548
+ "pre_quant_scale": true
1549
+ },
1550
+ "transformer.layers.59.mlp.fc": {
1551
+ "quant_algo": "W4A16_AWQ",
1552
+ "group_size": 128,
1553
+ "has_zero_point": false,
1554
+ "pre_quant_scale": true
1555
+ },
1556
+ "transformer.layers.59.mlp.gate": {
1557
+ "quant_algo": "W4A16_AWQ",
1558
+ "group_size": 128,
1559
+ "has_zero_point": false,
1560
+ "pre_quant_scale": true
1561
+ },
1562
+ "transformer.layers.59.mlp.proj": {
1563
+ "quant_algo": "W4A8_AWQ",
1564
+ "group_size": 128,
1565
+ "has_zero_point": false,
1566
+ "pre_quant_scale": true
1567
+ },
1568
+ "transformer.layers.60.attention.qkv": {
1569
+ "quant_algo": "W4A8_AWQ",
1570
+ "group_size": 128,
1571
+ "has_zero_point": false,
1572
+ "pre_quant_scale": true
1573
+ },
1574
+ "transformer.layers.60.attention.dense": {
1575
+ "quant_algo": "W4A16_AWQ",
1576
+ "group_size": 128,
1577
+ "has_zero_point": false,
1578
+ "pre_quant_scale": true
1579
+ },
1580
+ "transformer.layers.60.mlp.fc": {
1581
+ "quant_algo": "W4A8_AWQ",
1582
+ "group_size": 128,
1583
+ "has_zero_point": false,
1584
+ "pre_quant_scale": true
1585
+ },
1586
+ "transformer.layers.60.mlp.gate": {
1587
+ "quant_algo": "W4A8_AWQ",
1588
+ "group_size": 128,
1589
+ "has_zero_point": false,
1590
+ "pre_quant_scale": true
1591
+ },
1592
+ "transformer.layers.60.mlp.proj": {
1593
+ "quant_algo": "W4A8_AWQ",
1594
+ "group_size": 128,
1595
+ "has_zero_point": false,
1596
+ "pre_quant_scale": true
1597
+ },
1598
+ "transformer.layers.61.attention.qkv": {
1599
+ "quant_algo": "W4A8_AWQ",
1600
+ "group_size": 128,
1601
+ "has_zero_point": false,
1602
+ "pre_quant_scale": true
1603
+ },
1604
+ "transformer.layers.61.attention.dense": {
1605
+ "quant_algo": "W4A16_AWQ",
1606
+ "group_size": 128,
1607
+ "has_zero_point": false,
1608
+ "pre_quant_scale": true
1609
+ },
1610
+ "transformer.layers.61.mlp.fc": {
1611
+ "quant_algo": "W4A8_AWQ",
1612
+ "group_size": 128,
1613
+ "has_zero_point": false,
1614
+ "pre_quant_scale": true
1615
+ },
1616
+ "transformer.layers.61.mlp.gate": {
1617
+ "quant_algo": "W4A8_AWQ",
1618
+ "group_size": 128,
1619
+ "has_zero_point": false,
1620
+ "pre_quant_scale": true
1621
+ },
1622
+ "transformer.layers.61.mlp.proj": {
1623
+ "quant_algo": "W4A8_AWQ",
1624
+ "group_size": 128,
1625
+ "has_zero_point": false,
1626
+ "pre_quant_scale": true
1627
+ },
1628
+ "transformer.layers.62.attention.qkv": {
1629
+ "quant_algo": "W4A16_AWQ",
1630
+ "group_size": 128,
1631
+ "has_zero_point": false,
1632
+ "pre_quant_scale": true
1633
+ },
1634
+ "transformer.layers.62.attention.dense": {
1635
+ "quant_algo": "W4A16_AWQ",
1636
+ "group_size": 128,
1637
+ "has_zero_point": false,
1638
+ "pre_quant_scale": true
1639
+ },
1640
+ "transformer.layers.62.mlp.fc": {
1641
+ "quant_algo": "W4A8_AWQ",
1642
+ "group_size": 128,
1643
+ "has_zero_point": false,
1644
+ "pre_quant_scale": true
1645
+ },
1646
+ "transformer.layers.62.mlp.gate": {
1647
+ "quant_algo": "W4A8_AWQ",
1648
+ "group_size": 128,
1649
+ "has_zero_point": false,
1650
+ "pre_quant_scale": true
1651
+ },
1652
+ "transformer.layers.62.mlp.proj": {
1653
+ "quant_algo": "W4A16_AWQ",
1654
+ "group_size": 128,
1655
+ "has_zero_point": false,
1656
+ "pre_quant_scale": true
1657
+ },
1658
+ "transformer.layers.63.attention.qkv": {
1659
+ "quant_algo": "W4A16_AWQ",
1660
+ "group_size": 128,
1661
+ "has_zero_point": false,
1662
+ "pre_quant_scale": true
1663
+ },
1664
+ "transformer.layers.63.attention.dense": {
1665
+ "quant_algo": "W4A16_AWQ",
1666
+ "group_size": 128,
1667
+ "has_zero_point": false,
1668
+ "pre_quant_scale": true
1669
+ },
1670
+ "transformer.layers.63.mlp.fc": {
1671
+ "quant_algo": "W4A16_AWQ",
1672
+ "group_size": 128,
1673
+ "has_zero_point": false,
1674
+ "pre_quant_scale": true
1675
+ },
1676
+ "transformer.layers.63.mlp.gate": {
1677
+ "quant_algo": "W4A16_AWQ",
1678
+ "group_size": 128,
1679
+ "has_zero_point": false,
1680
+ "pre_quant_scale": true
1681
+ },
1682
+ "transformer.layers.63.mlp.proj": {
1683
+ "quant_algo": "W4A8_AWQ",
1684
+ "group_size": 128,
1685
+ "has_zero_point": false,
1686
+ "pre_quant_scale": true
1687
+ },
1688
+ "transformer.layers.64.attention.qkv": {
1689
+ "quant_algo": "W4A16_AWQ",
1690
+ "group_size": 128,
1691
+ "has_zero_point": false,
1692
+ "pre_quant_scale": true
1693
+ },
1694
+ "transformer.layers.64.attention.dense": {
1695
+ "quant_algo": "W4A8_AWQ",
1696
+ "group_size": 128,
1697
+ "has_zero_point": false,
1698
+ "pre_quant_scale": true
1699
+ },
1700
+ "transformer.layers.64.mlp.fc": {
1701
+ "quant_algo": "W4A16_AWQ",
1702
+ "group_size": 128,
1703
+ "has_zero_point": false,
1704
+ "pre_quant_scale": true
1705
+ },
1706
+ "transformer.layers.64.mlp.gate": {
1707
+ "quant_algo": "W4A16_AWQ",
1708
+ "group_size": 128,
1709
+ "has_zero_point": false,
1710
+ "pre_quant_scale": true
1711
+ },
1712
+ "transformer.layers.64.mlp.proj": {
1713
+ "quant_algo": "W4A8_AWQ",
1714
+ "group_size": 128,
1715
+ "has_zero_point": false,
1716
+ "pre_quant_scale": true
1717
+ },
1718
+ "transformer.layers.65.attention.qkv": {
1719
+ "quant_algo": "W4A8_AWQ",
1720
+ "group_size": 128,
1721
+ "has_zero_point": false,
1722
+ "pre_quant_scale": true
1723
+ },
1724
+ "transformer.layers.65.attention.dense": {
1725
+ "quant_algo": "W4A16_AWQ",
1726
+ "group_size": 128,
1727
+ "has_zero_point": false,
1728
+ "pre_quant_scale": true
1729
+ },
1730
+ "transformer.layers.65.mlp.fc": {
1731
+ "quant_algo": "W4A16_AWQ",
1732
+ "group_size": 128,
1733
+ "has_zero_point": false,
1734
+ "pre_quant_scale": true
1735
+ },
1736
+ "transformer.layers.65.mlp.gate": {
1737
+ "quant_algo": "W4A16_AWQ",
1738
+ "group_size": 128,
1739
+ "has_zero_point": false,
1740
+ "pre_quant_scale": true
1741
+ },
1742
+ "transformer.layers.65.mlp.proj": {
1743
+ "quant_algo": "W4A8_AWQ",
1744
+ "group_size": 128,
1745
+ "has_zero_point": false,
1746
+ "pre_quant_scale": true
1747
+ },
1748
+ "transformer.layers.66.attention.qkv": {
1749
+ "quant_algo": "W4A8_AWQ",
1750
+ "group_size": 128,
1751
+ "has_zero_point": false,
1752
+ "pre_quant_scale": true
1753
+ },
1754
+ "transformer.layers.66.attention.dense": {
1755
+ "quant_algo": "W4A16_AWQ",
1756
+ "group_size": 128,
1757
+ "has_zero_point": false,
1758
+ "pre_quant_scale": true
1759
+ },
1760
+ "transformer.layers.66.mlp.fc": {
1761
+ "quant_algo": "W4A8_AWQ",
1762
+ "group_size": 128,
1763
+ "has_zero_point": false,
1764
+ "pre_quant_scale": true
1765
+ },
1766
+ "transformer.layers.66.mlp.gate": {
1767
+ "quant_algo": "W4A8_AWQ",
1768
+ "group_size": 128,
1769
+ "has_zero_point": false,
1770
+ "pre_quant_scale": true
1771
+ },
1772
+ "transformer.layers.66.mlp.proj": {
1773
+ "quant_algo": "W4A8_AWQ",
1774
+ "group_size": 128,
1775
+ "has_zero_point": false,
1776
+ "pre_quant_scale": true
1777
+ },
1778
+ "transformer.layers.67.attention.qkv": {
1779
+ "quant_algo": "W4A8_AWQ",
1780
+ "group_size": 128,
1781
+ "has_zero_point": false,
1782
+ "pre_quant_scale": true
1783
+ },
1784
+ "transformer.layers.67.attention.dense": {
1785
+ "quant_algo": "W4A16_AWQ",
1786
+ "group_size": 128,
1787
+ "has_zero_point": false,
1788
+ "pre_quant_scale": true
1789
+ },
1790
+ "transformer.layers.67.mlp.fc": {
1791
+ "quant_algo": "W4A8_AWQ",
1792
+ "group_size": 128,
1793
+ "has_zero_point": false,
1794
+ "pre_quant_scale": true
1795
+ },
1796
+ "transformer.layers.67.mlp.gate": {
1797
+ "quant_algo": "W4A8_AWQ",
1798
+ "group_size": 128,
1799
+ "has_zero_point": false,
1800
+ "pre_quant_scale": true
1801
+ },
1802
+ "transformer.layers.67.mlp.proj": {
1803
+ "quant_algo": "W4A8_AWQ",
1804
+ "group_size": 128,
1805
+ "has_zero_point": false,
1806
+ "pre_quant_scale": true
1807
+ },
1808
+ "transformer.layers.68.attention.qkv": {
1809
+ "quant_algo": "W4A8_AWQ",
1810
+ "group_size": 128,
1811
+ "has_zero_point": false,
1812
+ "pre_quant_scale": true
1813
+ },
1814
+ "transformer.layers.68.attention.dense": {
1815
+ "quant_algo": "W4A8_AWQ",
1816
+ "group_size": 128,
1817
+ "has_zero_point": false,
1818
+ "pre_quant_scale": true
1819
+ },
1820
+ "transformer.layers.68.mlp.fc": {
1821
+ "quant_algo": "W4A8_AWQ",
1822
+ "group_size": 128,
1823
+ "has_zero_point": false,
1824
+ "pre_quant_scale": true
1825
+ },
1826
+ "transformer.layers.68.mlp.gate": {
1827
+ "quant_algo": "W4A8_AWQ",
1828
+ "group_size": 128,
1829
+ "has_zero_point": false,
1830
+ "pre_quant_scale": true
1831
+ },
1832
+ "transformer.layers.68.mlp.proj": {
1833
+ "quant_algo": "W4A16_AWQ",
1834
+ "group_size": 128,
1835
+ "has_zero_point": false,
1836
+ "pre_quant_scale": true
1837
+ },
1838
+ "transformer.layers.69.attention.qkv": {
1839
+ "quant_algo": "W4A16_AWQ",
1840
+ "group_size": 128,
1841
+ "has_zero_point": false,
1842
+ "pre_quant_scale": true
1843
+ },
1844
+ "transformer.layers.69.attention.dense": {
1845
+ "quant_algo": "W4A16_AWQ",
1846
+ "group_size": 128,
1847
+ "has_zero_point": false,
1848
+ "pre_quant_scale": true
1849
+ },
1850
+ "transformer.layers.69.mlp.fc": {
1851
+ "quant_algo": "W4A8_AWQ",
1852
+ "group_size": 128,
1853
+ "has_zero_point": false,
1854
+ "pre_quant_scale": true
1855
+ },
1856
+ "transformer.layers.69.mlp.gate": {
1857
+ "quant_algo": "W4A8_AWQ",
1858
+ "group_size": 128,
1859
+ "has_zero_point": false,
1860
+ "pre_quant_scale": true
1861
+ },
1862
+ "transformer.layers.69.mlp.proj": {
1863
+ "quant_algo": "W4A16_AWQ",
1864
+ "group_size": 128,
1865
+ "has_zero_point": false,
1866
+ "pre_quant_scale": true
1867
+ },
1868
+ "transformer.layers.70.attention.qkv": {
1869
+ "quant_algo": "W4A8_AWQ",
1870
+ "group_size": 128,
1871
+ "has_zero_point": false,
1872
+ "pre_quant_scale": true
1873
+ },
1874
+ "transformer.layers.70.attention.dense": {
1875
+ "quant_algo": "W4A16_AWQ",
1876
+ "group_size": 128,
1877
+ "has_zero_point": false,
1878
+ "pre_quant_scale": true
1879
+ },
1880
+ "transformer.layers.70.mlp.fc": {
1881
+ "quant_algo": "W4A16_AWQ",
1882
+ "group_size": 128,
1883
+ "has_zero_point": false,
1884
+ "pre_quant_scale": true
1885
+ },
1886
+ "transformer.layers.70.mlp.gate": {
1887
+ "quant_algo": "W4A16_AWQ",
1888
+ "group_size": 128,
1889
+ "has_zero_point": false,
1890
+ "pre_quant_scale": true
1891
+ },
1892
+ "transformer.layers.70.mlp.proj": {
1893
+ "quant_algo": "W4A16_AWQ",
1894
+ "group_size": 128,
1895
+ "has_zero_point": false,
1896
+ "pre_quant_scale": true
1897
+ },
1898
+ "transformer.layers.71.attention.qkv": {
1899
+ "quant_algo": "W4A16_AWQ",
1900
+ "group_size": 128,
1901
+ "has_zero_point": false,
1902
+ "pre_quant_scale": true
1903
+ },
1904
+ "transformer.layers.71.attention.dense": {
1905
+ "quant_algo": "W4A16_AWQ",
1906
+ "group_size": 128,
1907
+ "has_zero_point": false,
1908
+ "pre_quant_scale": true
1909
+ },
1910
+ "transformer.layers.71.mlp.fc": {
1911
+ "quant_algo": "W4A8_AWQ",
1912
+ "group_size": 128,
1913
+ "has_zero_point": false,
1914
+ "pre_quant_scale": true
1915
+ },
1916
+ "transformer.layers.71.mlp.gate": {
1917
+ "quant_algo": "W4A8_AWQ",
1918
+ "group_size": 128,
1919
+ "has_zero_point": false,
1920
+ "pre_quant_scale": true
1921
+ },
1922
+ "transformer.layers.71.mlp.proj": {
1923
+ "quant_algo": "W4A16_AWQ",
1924
+ "group_size": 128,
1925
+ "has_zero_point": false,
1926
+ "pre_quant_scale": true
1927
+ },
1928
+ "transformer.layers.72.attention.qkv": {
1929
+ "quant_algo": "W4A16_AWQ",
1930
+ "group_size": 128,
1931
+ "has_zero_point": false,
1932
+ "pre_quant_scale": true
1933
+ },
1934
+ "transformer.layers.72.attention.dense": {
1935
+ "quant_algo": "W4A8_AWQ",
1936
+ "group_size": 128,
1937
+ "has_zero_point": false,
1938
+ "pre_quant_scale": true
1939
+ },
1940
+ "transformer.layers.72.mlp.fc": {
1941
+ "quant_algo": "W4A16_AWQ",
1942
+ "group_size": 128,
1943
+ "has_zero_point": false,
1944
+ "pre_quant_scale": true
1945
+ },
1946
+ "transformer.layers.72.mlp.gate": {
1947
+ "quant_algo": "W4A16_AWQ",
1948
+ "group_size": 128,
1949
+ "has_zero_point": false,
1950
+ "pre_quant_scale": true
1951
+ },
1952
+ "transformer.layers.72.mlp.proj": {
1953
+ "quant_algo": "W4A8_AWQ",
1954
+ "group_size": 128,
1955
+ "has_zero_point": false,
1956
+ "pre_quant_scale": true
1957
+ },
1958
+ "transformer.layers.73.attention.qkv": {
1959
+ "quant_algo": "W4A8_AWQ",
1960
+ "group_size": 128,
1961
+ "has_zero_point": false,
1962
+ "pre_quant_scale": true
1963
+ },
1964
+ "transformer.layers.73.attention.dense": {
1965
+ "quant_algo": "W4A8_AWQ",
1966
+ "group_size": 128,
1967
+ "has_zero_point": false,
1968
+ "pre_quant_scale": true
1969
+ },
1970
+ "transformer.layers.73.mlp.fc": {
1971
+ "quant_algo": "W4A16_AWQ",
1972
+ "group_size": 128,
1973
+ "has_zero_point": false,
1974
+ "pre_quant_scale": true
1975
+ },
1976
+ "transformer.layers.73.mlp.gate": {
1977
+ "quant_algo": "W4A16_AWQ",
1978
+ "group_size": 128,
1979
+ "has_zero_point": false,
1980
+ "pre_quant_scale": true
1981
+ },
1982
+ "transformer.layers.73.mlp.proj": {
1983
+ "quant_algo": "W4A16_AWQ",
1984
+ "group_size": 128,
1985
+ "has_zero_point": false,
1986
+ "pre_quant_scale": true
1987
+ },
1988
+ "transformer.layers.74.attention.qkv": {
1989
+ "quant_algo": "W4A16_AWQ",
1990
+ "group_size": 128,
1991
+ "has_zero_point": false,
1992
+ "pre_quant_scale": true
1993
+ },
1994
+ "transformer.layers.74.attention.dense": {
1995
+ "quant_algo": "W4A8_AWQ",
1996
+ "group_size": 128,
1997
+ "has_zero_point": false,
1998
+ "pre_quant_scale": true
1999
+ },
2000
+ "transformer.layers.74.mlp.fc": {
2001
+ "quant_algo": "W4A16_AWQ",
2002
+ "group_size": 128,
2003
+ "has_zero_point": false,
2004
+ "pre_quant_scale": true
2005
+ },
2006
+ "transformer.layers.74.mlp.gate": {
2007
+ "quant_algo": "W4A16_AWQ",
2008
+ "group_size": 128,
2009
+ "has_zero_point": false,
2010
+ "pre_quant_scale": true
2011
+ },
2012
+ "transformer.layers.74.mlp.proj": {
2013
+ "quant_algo": "W4A8_AWQ",
2014
+ "group_size": 128,
2015
+ "has_zero_point": false,
2016
+ "pre_quant_scale": true
2017
+ },
2018
+ "transformer.layers.75.attention.qkv": {
2019
+ "quant_algo": "W4A16_AWQ",
2020
+ "group_size": 128,
2021
+ "has_zero_point": false,
2022
+ "pre_quant_scale": true
2023
+ },
2024
+ "transformer.layers.75.attention.dense": {
2025
+ "quant_algo": "W4A8_AWQ",
2026
+ "group_size": 128,
2027
+ "has_zero_point": false,
2028
+ "pre_quant_scale": true
2029
+ },
2030
+ "transformer.layers.75.mlp.fc": {
2031
+ "quant_algo": "W4A16_AWQ",
2032
+ "group_size": 128,
2033
+ "has_zero_point": false,
2034
+ "pre_quant_scale": true
2035
+ },
2036
+ "transformer.layers.75.mlp.gate": {
2037
+ "quant_algo": "W4A16_AWQ",
2038
+ "group_size": 128,
2039
+ "has_zero_point": false,
2040
+ "pre_quant_scale": true
2041
+ },
2042
+ "transformer.layers.75.mlp.proj": {
2043
+ "quant_algo": "W4A8_AWQ",
2044
+ "group_size": 128,
2045
+ "has_zero_point": false,
2046
+ "pre_quant_scale": true
2047
+ },
2048
+ "transformer.layers.76.attention.qkv": {
2049
+ "quant_algo": "FP8"
2050
+ },
2051
+ "transformer.layers.76.attention.dense": {
2052
+ "quant_algo": "W4A16_AWQ",
2053
+ "group_size": 128,
2054
+ "has_zero_point": false,
2055
+ "pre_quant_scale": true
2056
+ },
2057
+ "transformer.layers.76.mlp.fc": {
2058
+ "quant_algo": "W4A16_AWQ",
2059
+ "group_size": 128,
2060
+ "has_zero_point": false,
2061
+ "pre_quant_scale": true
2062
+ },
2063
+ "transformer.layers.76.mlp.gate": {
2064
+ "quant_algo": "W4A16_AWQ",
2065
+ "group_size": 128,
2066
+ "has_zero_point": false,
2067
+ "pre_quant_scale": true
2068
+ },
2069
+ "transformer.layers.76.mlp.proj": {
2070
+ "quant_algo": "W4A8_AWQ",
2071
+ "group_size": 128,
2072
+ "has_zero_point": false,
2073
+ "pre_quant_scale": true
2074
+ },
2075
+ "transformer.layers.77.attention.qkv": {
2076
+ "quant_algo": "W4A8_AWQ",
2077
+ "group_size": 128,
2078
+ "has_zero_point": false,
2079
+ "pre_quant_scale": true
2080
+ },
2081
+ "transformer.layers.77.attention.dense": {
2082
+ "quant_algo": "W4A16_AWQ",
2083
+ "group_size": 128,
2084
+ "has_zero_point": false,
2085
+ "pre_quant_scale": true
2086
+ },
2087
+ "transformer.layers.77.mlp.fc": {
2088
+ "quant_algo": "W4A8_AWQ",
2089
+ "group_size": 128,
2090
+ "has_zero_point": false,
2091
+ "pre_quant_scale": true
2092
+ },
2093
+ "transformer.layers.77.mlp.gate": {
2094
+ "quant_algo": "W4A8_AWQ",
2095
+ "group_size": 128,
2096
+ "has_zero_point": false,
2097
+ "pre_quant_scale": true
2098
+ },
2099
+ "transformer.layers.77.mlp.proj": {
2100
+ "quant_algo": "W4A8_AWQ",
2101
+ "group_size": 128,
2102
+ "has_zero_point": false,
2103
+ "pre_quant_scale": true
2104
+ },
2105
+ "transformer.layers.78.attention.qkv": {
2106
+ "quant_algo": "W4A8_AWQ",
2107
+ "group_size": 128,
2108
+ "has_zero_point": false,
2109
+ "pre_quant_scale": true
2110
+ },
2111
+ "transformer.layers.78.attention.dense": {
2112
+ "quant_algo": "W4A16_AWQ",
2113
+ "group_size": 128,
2114
+ "has_zero_point": false,
2115
+ "pre_quant_scale": true
2116
+ },
2117
+ "transformer.layers.78.mlp.fc": {
2118
+ "quant_algo": "FP8"
2119
+ },
2120
+ "transformer.layers.78.mlp.gate": {
2121
+ "quant_algo": "FP8"
2122
+ },
2123
+ "transformer.layers.78.mlp.proj": {
2124
+ "quant_algo": "W4A16_AWQ",
2125
+ "group_size": 128,
2126
+ "has_zero_point": false,
2127
+ "pre_quant_scale": true
2128
+ },
2129
+ "transformer.layers.79.attention.qkv": {
2130
+ "quant_algo": "W4A8_AWQ",
2131
+ "group_size": 128,
2132
+ "has_zero_point": false,
2133
+ "pre_quant_scale": true
2134
+ },
2135
+ "transformer.layers.79.attention.dense": {
2136
+ "quant_algo": "W4A16_AWQ",
2137
+ "group_size": 128,
2138
+ "has_zero_point": false,
2139
+ "pre_quant_scale": true
2140
+ },
2141
+ "transformer.layers.79.mlp.fc": {
2142
+ "quant_algo": "FP8"
2143
+ },
2144
+ "transformer.layers.79.mlp.gate": {
2145
+ "quant_algo": "FP8"
2146
+ },
2147
+ "transformer.layers.79.mlp.proj": {
2148
+ "quant_algo": "W4A16_AWQ",
2149
+ "group_size": 128,
2150
+ "has_zero_point": false,
2151
+ "pre_quant_scale": true
2152
+ },
2153
+ "transformer.layers.80.attention.qkv": {
2154
+ "quant_algo": "W4A8_AWQ",
2155
+ "group_size": 128,
2156
+ "has_zero_point": false,
2157
+ "pre_quant_scale": true
2158
+ },
2159
+ "transformer.layers.80.attention.dense": {
2160
+ "quant_algo": "W4A8_AWQ",
2161
+ "group_size": 128,
2162
+ "has_zero_point": false,
2163
+ "pre_quant_scale": true
2164
+ },
2165
+ "transformer.layers.80.mlp.fc": {
2166
+ "quant_algo": "FP8"
2167
+ },
2168
+ "transformer.layers.80.mlp.gate": {
2169
+ "quant_algo": "FP8"
2170
+ },
2171
+ "transformer.layers.80.mlp.proj": {
2172
+ "quant_algo": "W4A8_AWQ",
2173
+ "group_size": 128,
2174
+ "has_zero_point": false,
2175
+ "pre_quant_scale": true
2176
+ },
2177
+ "transformer.layers.81.attention.qkv": {
2178
+ "quant_algo": "W4A8_AWQ",
2179
+ "group_size": 128,
2180
+ "has_zero_point": false,
2181
+ "pre_quant_scale": true
2182
+ },
2183
+ "transformer.layers.81.attention.dense": {
2184
+ "quant_algo": "W4A16_AWQ",
2185
+ "group_size": 128,
2186
+ "has_zero_point": false,
2187
+ "pre_quant_scale": true
2188
+ },
2189
+ "transformer.layers.81.mlp.fc": {
2190
+ "quant_algo": "FP8"
2191
+ },
2192
+ "transformer.layers.81.mlp.gate": {
2193
+ "quant_algo": "FP8"
2194
+ },
2195
+ "transformer.layers.81.mlp.proj": {
2196
+ "quant_algo": "W4A16_AWQ",
2197
+ "group_size": 128,
2198
+ "has_zero_point": false,
2199
+ "pre_quant_scale": true
2200
+ },
2201
+ "transformer.layers.82.attention.qkv": {
2202
+ "quant_algo": "W4A16_AWQ",
2203
+ "group_size": 128,
2204
+ "has_zero_point": false,
2205
+ "pre_quant_scale": true
2206
+ },
2207
+ "transformer.layers.82.attention.dense": {
2208
+ "quant_algo": "W4A16_AWQ",
2209
+ "group_size": 128,
2210
+ "has_zero_point": false,
2211
+ "pre_quant_scale": true
2212
+ },
2213
+ "transformer.layers.82.mlp.fc": {
2214
+ "quant_algo": "FP8"
2215
+ },
2216
+ "transformer.layers.82.mlp.gate": {
2217
+ "quant_algo": "FP8"
2218
+ },
2219
+ "transformer.layers.82.mlp.proj": {
2220
+ "quant_algo": "W4A8_AWQ",
2221
+ "group_size": 128,
2222
+ "has_zero_point": false,
2223
+ "pre_quant_scale": true
2224
+ },
2225
+ "transformer.layers.83.attention.qkv": {
2226
+ "quant_algo": "W4A16_AWQ",
2227
+ "group_size": 128,
2228
+ "has_zero_point": false,
2229
+ "pre_quant_scale": true
2230
+ },
2231
+ "transformer.layers.83.attention.dense": {
2232
+ "quant_algo": "W4A16_AWQ",
2233
+ "group_size": 128,
2234
+ "has_zero_point": false,
2235
+ "pre_quant_scale": true
2236
+ },
2237
+ "transformer.layers.83.mlp.fc": {
2238
+ "quant_algo": "FP8"
2239
+ },
2240
+ "transformer.layers.83.mlp.gate": {
2241
+ "quant_algo": "FP8"
2242
+ },
2243
+ "transformer.layers.83.mlp.proj": {
2244
+ "quant_algo": "W4A8_AWQ",
2245
+ "group_size": 128,
2246
+ "has_zero_point": false,
2247
+ "pre_quant_scale": true
2248
+ },
2249
+ "transformer.layers.84.attention.qkv": {
2250
+ "quant_algo": "W4A8_AWQ",
2251
+ "group_size": 128,
2252
+ "has_zero_point": false,
2253
+ "pre_quant_scale": true
2254
+ },
2255
+ "transformer.layers.84.attention.dense": {
2256
+ "quant_algo": "W4A16_AWQ",
2257
+ "group_size": 128,
2258
+ "has_zero_point": false,
2259
+ "pre_quant_scale": true
2260
+ },
2261
+ "transformer.layers.84.mlp.fc": {
2262
+ "quant_algo": "FP8"
2263
+ },
2264
+ "transformer.layers.84.mlp.gate": {
2265
+ "quant_algo": "FP8"
2266
+ },
2267
+ "transformer.layers.84.mlp.proj": {
2268
+ "quant_algo": "W4A8_AWQ",
2269
+ "group_size": 128,
2270
+ "has_zero_point": false,
2271
+ "pre_quant_scale": true
2272
+ },
2273
+ "transformer.layers.85.attention.qkv": {
2274
+ "quant_algo": "W4A8_AWQ",
2275
+ "group_size": 128,
2276
+ "has_zero_point": false,
2277
+ "pre_quant_scale": true
2278
+ },
2279
+ "transformer.layers.85.attention.dense": {
2280
+ "quant_algo": "W4A16_AWQ",
2281
+ "group_size": 128,
2282
+ "has_zero_point": false,
2283
+ "pre_quant_scale": true
2284
+ },
2285
+ "transformer.layers.85.mlp.fc": {
2286
+ "quant_algo": "FP8"
2287
+ },
2288
+ "transformer.layers.85.mlp.gate": {
2289
+ "quant_algo": "FP8"
2290
+ },
2291
+ "transformer.layers.85.mlp.proj": {
2292
+ "quant_algo": "W4A8_AWQ",
2293
+ "group_size": 128,
2294
+ "has_zero_point": false,
2295
+ "pre_quant_scale": true
2296
+ },
2297
+ "transformer.layers.86.attention.qkv": {
2298
+ "quant_algo": "W4A8_AWQ",
2299
+ "group_size": 128,
2300
+ "has_zero_point": false,
2301
+ "pre_quant_scale": true
2302
+ },
2303
+ "transformer.layers.86.attention.dense": {
2304
+ "quant_algo": "W4A16_AWQ",
2305
+ "group_size": 128,
2306
+ "has_zero_point": false,
2307
+ "pre_quant_scale": true
2308
+ },
2309
+ "transformer.layers.86.mlp.fc": {
2310
+ "quant_algo": "FP8"
2311
+ },
2312
+ "transformer.layers.86.mlp.gate": {
2313
+ "quant_algo": "FP8"
2314
+ },
2315
+ "transformer.layers.86.mlp.proj": {
2316
+ "quant_algo": "FP8"
2317
+ },
2318
+ "transformer.layers.87.attention.qkv": {
2319
+ "quant_algo": "W4A8_AWQ",
2320
+ "group_size": 128,
2321
+ "has_zero_point": false,
2322
+ "pre_quant_scale": true
2323
+ },
2324
+ "transformer.layers.87.attention.dense": {
2325
+ "quant_algo": "W4A16_AWQ",
2326
+ "group_size": 128,
2327
+ "has_zero_point": false,
2328
+ "pre_quant_scale": true
2329
+ },
2330
+ "transformer.layers.87.mlp.fc": {
2331
+ "quant_algo": "FP8"
2332
+ },
2333
+ "transformer.layers.87.mlp.gate": {
2334
+ "quant_algo": "FP8"
2335
+ },
2336
+ "transformer.layers.87.mlp.proj": {
2337
+ "quant_algo": "FP8"
2338
+ }
2339
+ }
2340
+ }
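The per-layer entries in `quant_cfg.json` mix three `quant_algo` values: `W4A8_AWQ`, `W4A16_AWQ`, and `FP8`. The following is a minimal sketch for tallying them, not part of modelopt or TensorRT-LLM. The helper names `iter_layer_entries` and `tally_algos` are hypothetical, and so is the `hypothetical_wrapper` key in the sample. The name of the real enclosing key is not visible in this part of the diff, so the code walks the JSON recursively instead of assuming it.

```python
import json
from collections import Counter


def iter_layer_entries(node):
    """Yield (name, entry) for every dict value that carries a "quant_algo" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, dict) and "quant_algo" in value:
                yield key, value
            else:
                yield from iter_layer_entries(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_layer_entries(item)


def tally_algos(cfg):
    """Count quant_algo values over entries named like "transformer.layers.N...."."""
    return Counter(
        entry["quant_algo"]
        for name, entry in iter_layer_entries(cfg)
        if name.startswith("transformer.layers.")
    )


# Sample copied from the layer-87 entries in the diff; the wrapper key is an assumption.
awq = {"group_size": 128, "has_zero_point": False, "pre_quant_scale": True}
sample = {
    "hypothetical_wrapper": {
        "transformer.layers.87.attention.qkv": {"quant_algo": "W4A8_AWQ", **awq},
        "transformer.layers.87.attention.dense": {"quant_algo": "W4A16_AWQ", **awq},
        "transformer.layers.87.mlp.fc": {"quant_algo": "FP8"},
        "transformer.layers.87.mlp.gate": {"quant_algo": "FP8"},
        "transformer.layers.87.mlp.proj": {"quant_algo": "FP8"},
    }
}
counts = tally_algos(sample)
print(dict(counts))
```

To run it against the real file, assuming a local copy, load it with `json.load(open("quant_cfg.json"))` and pass the result to `tally_algos`.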
rank0.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2f1ba0031fb0f2c1e10ee3bd0c27b7b6f5ed06245b04ba2a6897992487b16302
3
+ size 20011576704
rank1.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f793b4d2c394141dbc55fbf7b825b44672f9dd4756895c914209b0c2d9a818ba
3
+ size 20011576704
rank2.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e9e76de5bf0bbd1c6ac1425b506f02a491225858772d3c7b24d41000b9c2b98d
3
+ size 20011576704
rank3.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3a4f89419a5dc7e33ef3f2b9d97c6cd29ccbce4e83009bd25c39f6bcc2cbff6e
3
+ size 20011576704
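Each `rankN.safetensors` entry above is a Git LFS pointer (a `version`, `oid`, and `size` line), not the weights themselves. The following is a rough sketch of reading one pointer and summing the declared sizes over the four ranks. `parse_lfs_pointer` is a hypothetical helper, not a Git LFS or `huggingface_hub` API. The total works only because all four pointers in this commit declare the same size.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer text copied from the rank0.safetensors entry in the diff.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:2f1ba0031fb0f2c1e10ee3bd0c27b7b6f5ed06245b04ba2a6897992487b16302
size 20011576704
"""
info = parse_lfs_pointer(pointer)

# Four rank files (rank0 to rank3), each declaring the same size in bytes.
total_bytes = 4 * info["size"]
print(info["hash_algo"], info["size"], total_bytes)
```

The four rank files correspond to `world_size` 4 (`tp_size` 4) in `config.json`, so a full download is about 80 GB.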