Perplexity scores for a Herd of 3B Llamas
#2 · opened by flyingkiwiguy
- Perplexities calculated using build = 635 (5c64a09) of llama.cpp and the first 406 lines of wiki.test.raw (a minimal invocation is sketched below)
- Previous perplexity benchmarking for llamas indicated that 406 lines is enough to compare different sizes and quantization levels
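In case it helps anyone reproduce this, here is a minimal sketch of that setup in Python. It assumes a `perplexity` binary built from llama.cpp sits in the working directory along with wiki.test.raw; the model filename is a placeholder, not one of the actual models tested:

```python
import subprocess
from itertools import islice

# Keep only the first 406 lines of wiki.test.raw, matching the setup above.
with open("wiki.test.raw", encoding="utf-8") as src:
    head = list(islice(src, 406))
with open("wiki.test.406.raw", "w", encoding="utf-8") as dst:
    dst.writelines(head)

# Run llama.cpp's perplexity tool on the truncated file. -m (model path) and
# -f (input file) are standard llama.cpp flags; the model name is a placeholder.
subprocess.run(
    ["./perplexity", "-m", "open-llama-3b-q4_0.bin", "-f", "wiki.test.406.raw"],
    check=True,
)
```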
Using only 406 lines would save me a lot of time, but 3B is pretty fast to compute anyway.
I iterated through most of your models at the three context sizes to get a complete picture of how good Open LLama is at varying quantization levels (the sweep is sketched below). There's still plenty of room for Open LLama to reach FB Llama quality.
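A sweep like that is easy to script on top of the invocation above. This is a rough sketch only: the specific context sizes, quantization levels, and filename scheme here are assumptions for illustration, not the exact values from these runs (`-c` is llama.cpp's context-size flag):

```python
import itertools
import subprocess

# Assumed grid; adjust to whichever quantization levels and context
# sizes you actually have models for.
quants = ["q4_0", "q4_1", "q5_0", "q5_1", "q8_0"]
ctx_sizes = [512, 1024, 2048]

for quant, ctx in itertools.product(quants, ctx_sizes):
    model = f"open-llama-3b-{quant}.bin"  # placeholder filename scheme
    subprocess.run(
        ["./perplexity", "-m", model, "-f", "wiki.test.406.raw", "-c", str(ctx)],
        check=True,
    )
```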
EDIT: I posted the same plot and included the data for the 100 or so runs I did to https://github.com/openlm-research/open_llama/discussions/41
@flyingkiwiguy These graphs are great!
There are also people over in this discussion who may be interested in your graphs:
flyingkiwiguy changed discussion title from "Perplexity scores for a Herd of 7B Llamas" to "Perplexity scores for a Herd of 3B Llamas"