CarperAI/pythia-2.8b-deduped-4k



	---
	license: apache-2.0
	datasets:
	- EleutherAI/the_pile_deduplicated
	language:
	- en
	---

	Pythia-2.8B Deduped 4K is a [Pythia-2.8B Deduped](https://huggingface.co/EleutherAI/pythia-2.8b-deduped) model fine-tuned with a context length of 4096 tokens.
	Training resumed from EleutherAI's 143,000-step checkpoint and continued on The Pile v1 Deduped (threshold=0.87).
	This particular model is the checkpoint captured at step 175,500, after an extra 134,217,728,000 tokens of training.
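
A minimal loading sketch follows. It assumes this checkpoint loads through the standard `transformers` auto classes, as other Pythia models do, and that the repository name is `CarperAI/pythia-2.8b-deduped-4k`. The helper names `clip_prompt` and `generate` are hypothetical and only illustrate how to keep the prompt plus the generated tokens inside the 4096-token context.

```python
MODEL_ID = "CarperAI/pythia-2.8b-deduped-4k"  # assumed repository name
MAX_CONTEXT = 4096  # context length stated in this model card


def clip_prompt(token_ids, max_new_tokens, max_context=MAX_CONTEXT):
    """Keep the most recent tokens so prompt + generation fits the context."""
    budget = max_context - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than the context length")
    return token_ids[-budget:]


def generate(prompt, max_new_tokens=64):
    """Illustrative helper: load the model and continue `prompt`."""
    # Imported lazily so clip_prompt works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    token_ids = clip_prompt(tokenizer(prompt)["input_ids"], max_new_tokens)
    input_ids = torch.tensor([token_ids])  # batch of one sequence
    with torch.no_grad():
        output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][len(token_ids):], skip_special_tokens=True)
```

Calling `generate("Once upon a time")` downloads the weights on first use. Clipping from the left keeps the most recent tokens, which is one reasonable choice for long prompts but not the only one.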

	Note: Sequence length warmup was not used when moving up from a context length of 2048 but, in hindsight, should have been applied.

	## Acknowledgements

	This work would not have been possible without the support of [Stability AI](https://stability.ai/).