Amethyst 13B Mistral - EXL2 - 2.2 bpw

Description

  • 2.2 bits per weight.
  • I don't think it's very usable: the output seems rather nonsensical compared to the 3 bpw quant.
  • I don't think exllamav2's current conversion script is able to convert to anything below ~2.18 bpw, at least not with the methods I tried.

I converted the model using the convert.py script from the exllamav2 repo:
https://github.com/turboderp/exllamav2
Its documentation:
https://github.com/turboderp/exllamav2/blob/master/doc/convert.md

I used the WikiText-2-v1 dataset for calibration:
https://huggingface.co/datasets/wikitext/blob/refs%2Fconvert%2Fparquet/wikitext-2-v1/test/0000.parquet
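
For reference, a conversion along these lines can be launched roughly as sketched below. This is a hedged illustration, not the exact command used for this upload: the paths are placeholders, and the flags should be checked against the linked convert.md.

```python
# Rough sketch of the quantization step (all paths are placeholders).
# Flag names follow doc/convert.md in the exllamav2 repo.
import subprocess

subprocess.run(
    [
        "python", "convert.py",
        "-i", "/models/Amethyst-13B-Mistral",                # original fp16 model directory (placeholder)
        "-o", "/tmp/exl2-work",                              # working / scratch directory (placeholder)
        "-cf", "/models/Amethyst-13B-Mistral-2.2bpw-exl2",   # output directory for the quantized model (placeholder)
        "-c", "0000.parquet",                                # WikiText-2-v1 test parquet linked above
        "-b", "2.2",                                         # target bits per weight
    ],
    check=True,
)
```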

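To run the quantized model, the basic generation example in the exllamav2 repo is a reasonable starting point. The sketch below follows that example; the model directory is a placeholder, and the API may have changed since this card was written, so check the repo's examples for the current usage.

```python
# Minimal generation sketch, adapted from exllamav2's basic example.
# model_dir is a placeholder for wherever the 2.2 bpw weights are stored.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Amethyst-13B-Mistral-2.2bpw-exl2"
config.prepare()

model = ExLlamaV2(config)
model.load()

tokenizer = ExLlamaV2Tokenizer(config)
cache = ExLlamaV2Cache(model)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("Once upon a time,", settings, 128))
```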