This repository explores the extreme compression ratios achievable for this model, so only low-bit quantized models are provided. All of them are quantized from the F16 weights.
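For reference, a minimal sketch of how a low-bit file can be produced from the F16 GGUF with llama.cpp's quantize tool. File names are placeholders, and the binary may be named `quantize` in older llama.cpp builds:

```python
import subprocess

# Produce an IQ2_M file from the F16 GGUF using llama.cpp's quantize tool.
# Paths and file names are placeholders; substitute your own.
subprocess.run(
    ["./llama-quantize", "model-F16.gguf", "model-IQ2_M.gguf", "IQ2_M"],
    check=True,
)
```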
Model | Size | Perplexity (PPL) |
---|---|---|
F16 | 15G | 8.3662 +/- 0.06216 |
IQ2_M | 2.8G | 10.2360 +/- 0.07470 |
IQ2_S | 2.6G | 11.3735 +/- 0.08396 |
IQ2_XS | 2.5G | 12.3081 +/- 0.08961 |
IQ2_XXS | 2.3G | 15.9081 +/- 0.11701 |
IQ1_M | 2.1G | 26.5610 +/- 0.19391 |
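A minimal sketch of loading one of the quantized files with the `llama-cpp-python` bindings; the file name is a placeholder for whichever GGUF you download from this repository:

```python
from llama_cpp import Llama

# Load a low-bit GGUF file (placeholder name) with a 2048-token context window.
llm = Llama(model_path="model-IQ2_M.gguf", n_ctx=2048)

# Run a short completion to confirm the model loads and generates text.
out = llm("Quantization reduces model size by", max_tokens=32)
print(out["choices"][0]["text"])
```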