4-bit HQQ quantized version of Meta-Llama-3.1-405B (base version). Quantization parameters:
nbits=2, group_size=128, quant_zero=True, quant_scale=True, axis=0
The checkpoint was split into shards with "split". To recombine them into a single file:
cat qmodel_shard* > qmodel.pt
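
The recombination step relies on the shell expanding the glob in sorted order, which matches the order of the suffixes that "split" appends (aa, ab, ac, ...). As a minimal sketch of that round trip, the script below splits a small dummy file and joins it again. The file names (original.pt, demo_shard_, recombined.pt) and the shard size are hypothetical and chosen for illustration only; the exact "split" options used for the real shards are not stated here.

```shell
# Round-trip demo on a small dummy file (hypothetical names, not the real shards).
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Create a 1 MiB dummy "checkpoint" filled with random bytes.
head -c 1048576 /dev/urandom > original.pt

# Split it into 256 KiB pieces; split appends the suffixes aa, ab, ac, ...
split -b 262144 original.pt demo_shard_

# The glob expands in sorted order, so the pieces are concatenated
# in the same order in which split wrote them.
cat demo_shard_* > recombined.pt

# Byte-for-byte comparison; cmp exits non-zero if the files differ.
cmp original.pt recombined.pt && echo "round-trip OK"
```

For the real shards, comparing the size of the recombined qmodel.pt against the sum of the shard sizes is a cheap sanity check before loading it.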