alyssavance committed on
Commit
46b6adf
1 Parent(s): 675bd81

Create README.md

Files changed (1)
  1. README.md +7 -0
README.md ADDED
@@ -0,0 +1,7 @@
+ 4-bit HQQ quantized version of Meta-Llama-3.1-405B (base version). Quantization parameters:
+
+ nbits=2, group_size=128, quant_zero=True, quant_scale=True, axis=0
+
+ Shards have been split with "split", to recombine:
+
+ cat qmodel_shard* > qmodel.pt
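
The recombine command works because the shell expands `qmodel_shard*` in sorted order, which matches the order of the suffixes that `split` generates (`aa`, `ab`, ...). The sketch below runs the same round trip on a small dummy file to illustrate this. It assumes a POSIX shell with coreutils. The file names, the file size, and the shard size are hypothetical and are not taken from the repository.

```shell
# Round-trip demo on a dummy file; names and sizes are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Create a 1 MiB stand-in for the checkpoint.
head -c 1048576 /dev/urandom > demo.pt

# Split it into 256 KiB shards: demo_shardaa, demo_shardab, ...
split -b 262144 demo.pt demo_shard

# Recombine; the glob expands in sorted order, matching split's suffix order.
cat demo_shard* > demo_rejoined.pt

# Verify that the rejoined file is byte-identical to the original.
cmp demo.pt demo_rejoined.pt && echo "round-trip OK"
```

For the real shards, running `ls qmodel_shard*` first shows the order in which `cat` will read them, and comparing a checksum of `qmodel.pt` against one published by the uploader (if available) confirms the result.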