Llama-3.1-8B-Instruct-BioQA-AWQ

A 4-bit AWQ-quantized version of Llama-3.1-8B-Instruct. The model was quantized with AutoAWQ using biomedical question-answering (QA) data for calibration.

Key Details

  • Base Model: Llama-3.1-8B-Instruct
  • Calibration Data: Biomedical question-answering (QA)
  • Template: Official Llama chat format
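The details above translate into a straightforward loading path. Below is a minimal sketch (not executed here) of loading the checkpoint and answering one question through the official Llama chat format; it assumes `transformers` and `autoawq` are installed, and the repo id is the one this card is published under.

```python
MODEL_ID = "anthonyyazdaniml/Llama-3.1-8B-Instruct-BioQA-AWQ"

def generate_answer(question: str, max_new_tokens: int = 256) -> str:
    """Load the AWQ checkpoint and answer one biomedical question.

    Sketch only: requires a GPU and downloads the weights on first call.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # apply_chat_template renders the official Llama chat format,
    # so no prompt string needs to be hand-built.
    messages = [{"role": "user", "content": question}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and decode only the generated answer.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

The same checkpoint can also be served with engines that support AWQ kernels (e.g. vLLM), since the weights use the standard AutoAWQ GEMM layout.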

Quantization Config

quant_config = {
    "zero_point": True,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM"
}
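For reference, this config plugs directly into AutoAWQ's quantization call. The sketch below shows that step under stated assumptions: the calibration-text list and output directory are hypothetical placeholders, and the function is defined but not run (quantizing an 8B model requires a GPU and the base weights). The comments explain what each config field controls.

```python
quant_config = {
    "zero_point": True,   # asymmetric quantization with a per-group zero point
    "q_group_size": 128,  # weights quantized in groups of 128 values
    "w_bit": 4,           # 4-bit weight precision
    "version": "GEMM",    # GEMM kernel variant of the AWQ packed format
}

def quantize(base_model: str, calib_texts: list[str], out_dir: str) -> None:
    """Quantize the base model with AutoAWQ using biomedical QA text as
    calibration data. Assumes `pip install autoawq`; `calib_texts` and
    `out_dir` are hypothetical inputs supplied by the caller.
    """
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(base_model)
    tokenizer = AutoTokenizer.from_pretrained(base_model)

    # AutoAWQ accepts raw text samples as calibration data; activation
    # statistics from these samples drive the weight scaling.
    model.quantize(tokenizer, quant_config=quant_config, calib_data=calib_texts)

    model.save_quantized(out_dir)
    tokenizer.save_pretrained(out_dir)
```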

License

The model follows the license of the base Llama-3.1-8B-Instruct model.

