# Install prerequisites
pip install numpy gekko pandas

# Clone AutoGPTQ and install it from source in editable mode
git clone https://github.com/PanQiWei/AutoGPTQ.git && cd AutoGPTQ
pip install -vvv --no-build-isolation -e .

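from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer, TextStreamer

model_id = "GSJL/Qwen2.5-14B-Instruct-GPTQ-Marlin"

# Load the quantized weights with the Marlin kernel and move the model to the first GPU
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    use_marlin=True,
).to("cuda:0")

# Load the tokenizer from the same repository (assumed to ship with the quantized weights)
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
# Print tokens to stdout as they are generated, omitting the prompt and special tokens
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)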

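# A single-turn conversation in the chat-message format
prompt = [{"role": "user", "content": "Hi mom!!!!!"}]

# Render the chat template to token IDs and append the assistant-turn prefix
inputs = tokenizer.apply_chat_template(
    prompt,
    return_tensors="pt",
    add_generation_prompt=True,
).to("cuda:0")

# Sample up to 600 new tokens; the streamer prints them as they arrive
output = model.generate(
    input_ids=inputs,
    streamer=streamer,
    use_cache=True,
    do_sample=True,
    max_new_tokens=600,
)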

Safetensors · Model size: 3.31B params · Tensor types: I32, FP16