ggml versions of OpenLLaMA 7B
For use with llama.cpp.
- Version: final release, trained on 1T tokens
- Project: OpenLLaMA: An Open Reproduction of LLaMA
- Model: openlm-research/open_llama_7b
- llama.cpp 4-, 5-, and 8-bit quantization: build 567 (2d5db48) or later
- llama.cpp newer quantization formats: build 616 (99009e7) or later
Perplexity
Calculated with llama.cpp, default settings (context 512, batch 512).
Test data: wiki.test.raw from WikiText-103 (lower perplexity is better):
model | perplexity |
---|---|
open-llama-7b-q2_K.bin | 8.5152 |
open-llama-7b-q3_K_S.bin | 7.6623 |
open-llama-7b-q3_K.bin | 7.3837 |
open-llama-7b-q3_K_L.bin | 7.3043 |
open-llama-7b-q4_0.bin | 7.2116 |
open-llama-7b-q4_1.bin | 7.1609 |
open-llama-7b-q4_K_S.bin | 7.1516 |
open-llama-7b-q4_K.bin | 7.1116 |
open-llama-7b-q5_0.bin | 7.0353 |
open-llama-7b-q5_K_S.bin | 7.0325 |
open-llama-7b-q5_1.bin | 7.0318 |
open-llama-7b-q5_K.bin | 7.0272 |
open-llama-7b-q6_K.bin | 7.0050 |
open-llama-7b-q8_0.bin | 6.9968 |
open-llama-7b-f16.bin | 6.9966 |
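
To put the table in perspective, here is a minimal Python sketch. It assumes the standard definition of perplexity (the exponential of the mean negative natural-log probability per token) and has not been checked against llama.cpp's own code. The helper names `perplexity` and `relative_increase` are made up for this example; the scores are copied from the table above.

```python
import math


def perplexity(token_logprobs):
    """Standard perplexity: exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Perplexity scores copied from the table above (file name prefix/suffix dropped).
SCORES = {
    "q2_K": 8.5152,
    "q3_K_S": 7.6623,
    "q3_K": 7.3837,
    "q3_K_L": 7.3043,
    "q4_0": 7.2116,
    "q4_1": 7.1609,
    "q4_K_S": 7.1516,
    "q4_K": 7.1116,
    "q5_0": 7.0353,
    "q5_K_S": 7.0325,
    "q5_1": 7.0318,
    "q5_K": 7.0272,
    "q6_K": 7.0050,
    "q8_0": 6.9968,
    "f16": 6.9966,
}


def relative_increase(scores, baseline="f16"):
    """Percentage increase in perplexity of each entry over the baseline entry."""
    base = scores[baseline]
    return {name: (ppl - base) / base * 100.0 for name, ppl in scores.items()}


if __name__ == "__main__":
    # Print each quantization with its perplexity and its increase over f16.
    for name, pct in relative_increase(SCORES).items():
        print(f"{name:8s} {SCORES[name]:.4f}  +{pct:.2f}%")
```

Read this way, q8_0 is within a fraction of a percent of f16, while q2_K is roughly 21.7% worse.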