---
language:
- en
tags:
- ggml
- causal-lm
- pythia
license: apache-2.0
datasets:
- EleutherAI/the_pile_deduplicated
---
|
|
|
# Pythia Deduped Series GGML |
|
### This repository contains quantized conversions of EleutherAI's Pythia Deduped checkpoints. |
|
*For use with frontends that support GGML quantized GPT-NeoX models, such as KoboldCpp and Oobabooga (with the CTransformers loader).* |
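As a minimal sketch, one way to load these files in Python is through the `ctransformers` library (the loader Oobabooga uses), which has a GGML GPT-NeoX backend selected via `model_type="gpt_neox"`. The local file path below is an assumption; download the `.bin` file from this repository first.

```python
# Hedged sketch: loading a GGML GPT-NeoX checkpoint with ctransformers.
# The file name is one of the quantized checkpoints from this repository;
# the local path is assumed, not guaranteed.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "ggmlv3-pythia-410m-deduped-q4_0.bin",  # assumed local path
    model_type="gpt_neox",                  # GGML architecture name for Pythia
)

print(llm("The Pile is", max_new_tokens=32))
```

Larger checkpoints work the same way; only the file name (and the RAM cost, see the table below) changes.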
|
|
|
*Last updated on 2023-05-25.* |
|
|
|
For other versions of these models, see here:
- [GGMLv1 q4_3](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-04-20) (70M to 12B)
- [GGMLv1 q5_0 / q5_1 / q8_0](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-04-30) (70M to 2.8B)
- [GGMLv1 q4_0 / q4_2](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-05-06) (70M to 2.8B)
- [GGMLv2 q4_0 / q5_1](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/a695a4c30c01ed9a41200c01f85d47c819fc93dd/2023-05-15) (70M to 2.8B)
- [GGMLv3 q4_0 / q5_1](https://huggingface.co/Crataco/Pythia-Deduped-Series-GGML/tree/main)
|
|
|
|
|
# RAM USAGE

Model | RAM usage
:--:|:--:
Unloaded | 41.3 MiB
 |
ggmlv3-pythia-70m-deduped-q4_0.bin | 95.5 MiB
ggmlv3-pythia-160m-deduped-q4_0.bin | 201.1 MiB
ggmlv3-pythia-410m-deduped-q4_0.bin | 415.1 MiB
ggmlv3-pythia-1b-deduped-q4_0.bin | 762.2 MiB
ggmlv3-pythia-1.4b-deduped-q4_0.bin | 1.0 GiB
ggmlv3-pythia-2.8b-deduped-q4_0.bin | 1.9 GiB
 |
ggmlv3-pythia-70m-deduped-q5_1.bin | 108.7 MiB
ggmlv3-pythia-160m-deduped-q5_1.bin | 226.9 MiB
ggmlv3-pythia-410m-deduped-q5_1.bin | 494.0 MiB
ggmlv3-pythia-1b-deduped-q5_1.bin | 943.9 MiB
ggmlv3-pythia-1.4b-deduped-q5_1.bin | 1.3 GiB
ggmlv3-pythia-2.8b-deduped-q5_1.bin | 2.3 GiB
|
|
|
*Tested on KoboldCpp with OpenBLAS enabled.* |