Which one is the fastest?

#1
by wawdili - opened

Q8_0 scores 1.35s/it, while Q5_K_M scores 1.77s/it, so Q8_0 is faster.
If convenient, speed could be added as a reference. I'm not sure whether the '_0' variants are faster in general, but it would be great if the speed were clearly labeled.

Did you fix the parameters? Random factors are involved even with the same prompt by default: if you dice a picture with more complicated elements, it will take longer than the normal processing time. In terms of probability, speed could be determined simply by file size.

I did not change any parameters except for which U-net model was loaded, and I fixed the seed. I tried several times and got the same result: Q8_0 is faster.
Here is my computer configuration:
GPU - RTX2060 (6GB, overclocking)
CPU - i5 12400f
MEM - 16x2GB 3.2GHz
Also, I'm not sure what the phrase 'if you dice a picture with more complex elements' means. Does it mean 'seed'? (I'm using a translator.)
I haven't tested other versions, but if Q5_K_M is faster on your computer, we may need more reports from people to draw conclusions.

It means random factors, like the probability of a dice roll. You could try generating a few more times before jumping to a conclusion. The model structure itself does affect performance.
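
A minimal Python sketch of that advice: repeat the measurement several times and report the median, so one slow outlier does not decide the comparison. The names `seconds_per_iteration`, `summarize`, and `dummy_step` are hypothetical; `dummy_step` only stands in for one fixed-seed sampling step and is not part of any real library.

```python
import statistics
import time


def seconds_per_iteration(step, iterations=20, repeats=5):
    """Time `step` over several repeats; return one s/it figure per repeat."""
    results = []
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(iterations):
            step()
        elapsed = time.perf_counter() - start
        results.append(elapsed / iterations)
    return results


def summarize(samples):
    """Return (min, median, max) of the s/it samples."""
    return min(samples), statistics.median(samples), max(samples)


def dummy_step():
    # Hypothetical stand-in for one denoising step; replace with a real call.
    time.sleep(0.001)


if __name__ == "__main__":
    samples = seconds_per_iteration(dummy_step, iterations=5, repeats=3)
    low, median, high = summarize(samples)
    print(f"{low:.4f}s/it ~ {high:.4f}s/it (median {median:.4f}s/it)")
```

Running each quantization through the same harness with the same seed and settings gives ranges that can be compared directly.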

Observation data:
Q8_0: 1.32s/it ~ 1.49s/it (median 1.35s/it)
Q5_K_M: 1.68s/it ~ 2.83s/it (median 1.71s/it)

By median, Q8_0 is 26.7% faster (1.71 / 1.35 ≈ 1.267).
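
For reference, the 26.7% figure follows from the two medians above. Here is a small Python sketch of the arithmetic (the function name `relative_speed` is only illustrative). It also shows the same gap expressed as time saved per iteration, which is a smaller number.

```python
def relative_speed(slow_s_per_it, fast_s_per_it):
    """Return (throughput gain %, time saved %) of the faster run."""
    # Throughput is iterations per second, i.e. the inverse of s/it.
    throughput_gain = (slow_s_per_it / fast_s_per_it - 1) * 100
    # Time saved compares s/it directly.
    time_saved = (1 - fast_s_per_it / slow_s_per_it) * 100
    return throughput_gain, time_saved


# Medians reported above: Q5_K_M 1.71s/it, Q8_0 1.35s/it.
gain, saved = relative_speed(1.71, 1.35)
print(f"{gain:.1f}% more iterations per second, {saved:.1f}% less time per iteration")
# prints: 26.7% more iterations per second, 21.1% less time per iteration
```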

just a quick test with q4_0, q5_k_m and q8_0
q4_0_vs_q5_k_m_vs_q8_0.png

Then did the same with q3_k_s and ran q8_0 again:
q8_0_vs_q3_k_s_vs_q8_0.png

It seems q8_0 is noticeably faster: the fastest among them, disregarding the memory buffer (tested on a 4050). Interesting.

That's the answer.

wawdili changed discussion status to closed
