What settings are you using for this iMatrix quant?
#1 opened by cosmojg
For example, what context size, chunk size, and calibration dataset are you using?
On this one, wikitext.train.raw (I always use it), ctx 32, chunks 3000.
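For reference, those settings would map onto the llama.cpp imatrix workflow roughly like the sketch below. The model and output file names are placeholders, and exact binary names vary between llama.cpp versions (older builds ship `imatrix` / `quantize` instead of the `llama-` prefixed names):

```bash
# Sketch of an imatrix run with the settings above; paths are placeholders.
./llama-imatrix \
  -m model-f16.gguf \
  -f wikitext.train.raw \
  -o imatrix.dat \
  -c 32 \
  --chunks 3000

# The resulting importance matrix is then passed to the quantizer:
./llama-quantize --imatrix imatrix.dat model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```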
Interesting! Have you tried experimenting with other datasets? For example, I can't help but wonder if you'd get better performance out of that 20k_random_data.txt file shared in the llama.cpp GitHub discussion thread about improved importance matrix calculations on near-random data.
No, I didn't.
But Kalomaze has taken an interest in this: https://github.com/ggerganov/llama.cpp/discussions/5006
You've probably already read that thread, considering what you mention.
Alas, I didn't test that dataset, because it seems to me that wikitext.train.raw still takes the gold medal in most cases for now.
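If someone did want to compare the two calibration datasets, one reasonable approach (not the only one) is to quantize the same model with each resulting imatrix and compare perplexity on a held-out text. A rough sketch, where the imatrix, model, and eval file names are all placeholders:

```bash
# Quantize once per calibration dataset, then measure perplexity on the same
# held-out text (wiki.test.raw here stands in for whatever eval file you use).
./llama-quantize --imatrix imatrix-wikitext.dat model-f16.gguf model-wikitext-Q4_K_M.gguf Q4_K_M
./llama-quantize --imatrix imatrix-random.dat   model-f16.gguf model-random-Q4_K_M.gguf   Q4_K_M

./llama-perplexity -m model-wikitext-Q4_K_M.gguf -f wiki.test.raw
./llama-perplexity -m model-random-Q4_K_M.gguf   -f wiki.test.raw
```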