
---
license: other
license_name: tongyi-qianwen-license-agreement
license_link: >-
  https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT
pipeline_tag: text-generation
---

GGUF importance matrix (imatrix) quants for https://huggingface.co/abacusai/Smaug-72B-v0.1
The importance matrix was trained for 100K tokens (200 batches of 512 tokens) using wiki.train.raw.

Update 2024-03-14:

New quant IQ1_S using latest commit 4755afd1.

Update 2024-03-02:

New quants IQ2_S and IQ2_M; these require commit a33e6a0d or later.
The importance matrix was trained for ~50K tokens (105 batches of 512 tokens) using a general purpose imatrix calibration dataset.
This is a different calibration dataset from the one used for the previous quants I posted, so the quality of the two can be compared.
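
The token counts quoted for the two calibration runs follow from the number of batches times the batch size. A minimal sketch of that arithmetic (the function name `calibration_tokens` is just an illustrative helper, not part of any tool):

```python
# Batch size used for both imatrix calibration runs described in this README.
BATCH_SIZE = 512


def calibration_tokens(batches: int, batch_size: int = BATCH_SIZE) -> int:
    """Return the total number of tokens seen during imatrix calibration."""
    return batches * batch_size


# First run: 200 batches on wiki.train.raw (quoted as "100K tokens").
print(calibration_tokens(200))  # 102400

# Second run: 105 batches on the general purpose dataset (quoted as "~50K tokens").
print(calibration_tokens(105))  # 53760
```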

The model uses the Llama-2 conversation template, with the system prompt set to the Qwen system prompt.

| Layers | Context | Template |
| --- | --- | --- |
| 80 | 32768 | `[INST] <<SYS>> {instructions} <</SYS>> {prompt} [/INST] {response}` |
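
As a rough illustration of how the template above can be filled in, here is a minimal sketch. The helper name `build_prompt` is hypothetical, and the default system prompt string is an assumption about what "the Qwen system prompt" refers to; the exact whitespace or newlines expected by the model may also differ from the single-line form shown in the table.

```python
# Assumption: the Qwen default system prompt. Replace it if your setup differs.
QWEN_SYSTEM_PROMPT = "You are a helpful assistant."


def build_prompt(prompt: str, instructions: str = QWEN_SYSTEM_PROMPT) -> str:
    """Fill the Llama-2 style template up to [/INST].

    The {response} slot is left out on purpose: the model generates it.
    """
    return f"[INST] <<SYS>> {instructions} <</SYS>> {prompt} [/INST]"


if __name__ == "__main__":
    print(build_prompt("Explain what an importance matrix is in one sentence."))
```

The resulting string can be passed as the raw prompt to whichever GGUF runtime you use; the model's reply takes the place of `{response}`.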