---
license: mit
---

This repository explores the extreme compression ratio of the model, so only low-bit quantized models are provided. All of them are quantized from F16.

| model   | size | ppl                 |
| ------- | ---- | ------------------- |
| F16     | 15G  | 8.3662 +/- 0.06216  |
| IQ2_M   | 2.8G | 10.2360 +/- 0.07470 |
| IQ2_S   | 2.6G | 11.3735 +/- 0.08396 |
| IQ2_XS  | 2.5G | 12.3081 +/- 0.08961 |
| IQ2_XXS | 2.3G | 15.9081 +/- 0.11701 |
| IQ1_M   | 2.1G | 26.5610 +/- 0.19391 |
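The size/quality trade-off in the table can be summarized numerically. The sketch below (a quick illustration, not part of any release tooling) computes, for each quant, how many times smaller it is than the F16 baseline and the relative perplexity increase, using the figures from the table:

```python
# Compression ratio and perplexity degradation per quant,
# taken directly from the size/ppl table above.
f16_size, f16_ppl = 15.0, 8.3662  # F16 baseline: size in GB, perplexity

quants = {
    "IQ2_M":   (2.8, 10.2360),
    "IQ2_S":   (2.6, 11.3735),
    "IQ2_XS":  (2.5, 12.3081),
    "IQ2_XXS": (2.3, 15.9081),
    "IQ1_M":   (2.1, 26.5610),
}

for name, (size, ppl) in quants.items():
    ratio = f16_size / size          # how many times smaller than F16
    ppl_delta = ppl / f16_ppl - 1.0  # relative perplexity increase
    print(f"{name:8s} {ratio:4.1f}x smaller, ppl +{ppl_delta:.1%}")
```

For example, IQ1_M is roughly 7x smaller than F16 but more than triples the perplexity, while IQ2_M keeps the increase near 22% at about 5x compression.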