---
language:
- en
tags:
- ggml
- causal-lm
- pythia
license: apache-2.0
datasets:
- EleutherAI/the_pile_deduplicated
---
|
|
|
# Pythia Deduped Series GGML |
|
### This repository contains quantized conversions of EleutherAI's Pythia Deduped checkpoints. |
|
*For use with frontends that support GGML quantized GPT-NeoX models, such as KoboldCpp and Oobabooga (with the CTransformers loader).* |
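As a minimal sketch, one way to load these files in Python is through the `ctransformers` library (the loader Oobabooga uses), which has a GGML GPT-NeoX backend selected via `model_type="gpt_neox"`. The local file path below is an assumption; download the `.bin` file from this repository first.

```python
# Hedged sketch: loading a GGML GPT-NeoX checkpoint with ctransformers.
# The file name is one of the quantized checkpoints from this repository;
# the local path is assumed, not guaranteed.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "ggmlv3-pythia-410m-deduped-q4_0.bin",  # assumed local path
    model_type="gpt_neox",                  # GGML architecture name for Pythia
)

print(llm("The Pile is", max_new_tokens=32))
```

Larger checkpoints work the same way; only the file name (and the RAM cost, see the table below) changes.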
|
|
|
*Last updated on 2023-05-25.* |
|
|
|
For other versions of these models, see here:
- [GGMLv1 q4_3](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-04-20) (70M to 12B)
- [GGMLv1 q5_0 / q5_1 / q8_0](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-04-30) (70M to 2.8B)
- [GGMLv1 q4_0 / q4_2](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-05-06) (70M to 2.8B)
- [GGMLv2 q4_0 / q5_1](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-05-15) (70M to 2.8B)
- [GGMLv3 q4_0 / q5_1](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/main)
|
|
|
|
|
# RAM USAGE

Model | RAM usage
:--:|:--:
Unloaded | 41.3 MiB
 |
ggmlv3-pythia-70m-deduped-q4_0.bin | 95.5 MiB
ggmlv3-pythia-160m-deduped-q4_0.bin | 201.1 MiB
ggmlv3-pythia-410m-deduped-q4_0.bin | 415.1 MiB
ggmlv3-pythia-1b-deduped-q4_0.bin | 762.2 MiB
ggmlv3-pythia-1.4b-deduped-q4_0.bin | 1.0 GiB
ggmlv3-pythia-2.8b-deduped-q4_0.bin | 1.9 GiB
 |
ggmlv3-pythia-70m-deduped-q5_1.bin | 108.7 MiB
ggmlv3-pythia-160m-deduped-q5_1.bin | 226.9 MiB
ggmlv3-pythia-410m-deduped-q5_1.bin | 494.0 MiB
ggmlv3-pythia-1b-deduped-q5_1.bin | 943.9 MiB
ggmlv3-pythia-1.4b-deduped-q5_1.bin | 1.3 GiB
ggmlv3-pythia-2.8b-deduped-q5_1.bin | 2.3 GiB
|
|
|
*Tested on KoboldCpp with OpenBLAS enabled.* |