please include "-imat" in the repository title

#2
by AaronFeng753 - opened

Thank you for providing these GGUF models. However, I have been downloading your models for a long time without realizing that all of your GGUF models, including the KM ones, are iMatrix-calibrated GGUF models.

I acknowledge this is my mistake for not reading the model card carefully, but I believe it would be helpful if you included "iMat" in the title of your repository, as most other GGUF models are not iMat-calibrated.

I am concerned that iMatrix calibration may negatively impact the multilingual capabilities of the models, given that your calibration dataset contains only English materials.

Now, I have to re-download many GGUF models again...

For what it's worth, I have been testing my imatrix dataset against multilingual performance and found that it still only improves results.

Using my imatrix, I compared my Q4_K_M against a static one on the Japanese wiki (even though my dataset contains zero Japanese characters) and found that my quant performed closer to the original weights than the static one did.

So I don't think you have to go and redownload anything or use static quants; because of the way the models work, imatrix won't adversely affect multilingual performance.

If you have a specific language concern, I can test explicitly against it and show you the performance gains versus static quants.
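(For context, "performed closer to the original weights" is the kind of result a perplexity comparison gives, which is what llama.cpp's perplexity tool reports. A minimal sketch of how perplexity is computed from per-token log-probabilities, using entirely made-up numbers; the quant whose score is nearer the full-precision model's degraded less:)

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-probability per token.
    Lower is better; a score closer to the original (f16) model's
    means less quantization damage."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical per-token log-probs for the same text under three models:
full = [-1.9, -2.1, -2.0]    # original f16 weights
imat = [-2.0, -2.2, -2.1]    # imatrix-calibrated Q4_K_M (illustrative)
static = [-2.4, -2.6, -2.5]  # static Q4_K_M (illustrative)

ppl_full, ppl_imat, ppl_static = map(perplexity, (full, imat, static))
print(ppl_full, ppl_imat, ppl_static)
```

With these toy numbers the imatrix quant's perplexity lands closer to the f16 score than the static quant's does, which is the pattern being claimed above.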

Thank you for this explanation!

AaronFeng753 changed discussion status to closed

By the way, I noticed that iMat might negatively impact the performance of Qwen2.5 14B. You can check out my evaluation results here:

https://www.reddit.com/r/LocalLLaMA/comments/1flqwzw/qwen25_14b_gguf_quantization_evaluation_results/

As shown, K_L-iMat quantizations consistently perform worse than static K_M quantizations from Ollama.

This could be due to the model's smaller size and the fact that it's developed by a Chinese company, while your calibration dataset is mostly in English.

Of course, this is just my assumption. If you'd like to evaluate the Chinese-language performance of your iMat quant against a static quant, here's the Chinese MMLU dataset:

https://huggingface.co/datasets/haonan-li/cmmlu
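(Scoring a quant on a CMMLU-style benchmark comes down to accuracy over A/B/C/D picks; a minimal sketch with hypothetical predictions, just to show the comparison being proposed:)

```python
def mc_accuracy(predictions, answers):
    """Fraction of multiple-choice questions answered correctly."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

# Hypothetical answer keys and model picks on the same 5 questions:
gold = ["A", "C", "B", "D", "A"]
imat_picks = ["A", "C", "B", "A", "A"]    # illustrative only
static_picks = ["A", "C", "D", "A", "A"]  # illustrative only

print(mc_accuracy(imat_picks, gold), mc_accuracy(static_picks, gold))
```

Running both quants over the full dataset and comparing these accuracy numbers would directly test the assumption above.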
