---
license: apache-2.0
datasets:
- EleutherAI/the_pile_deduplicated
language:
- en
---
Pythia-2.8B Deduped 4K is a [Pythia-2.8B Deduped](https://huggingface.co/EleutherAI/pythia-2.8b-deduped) model fine-tuned with a 4096-token context length.
Training resumed from EleutherAI's 143,000-step checkpoint and continued on The Pile v1 Deduped (deduplication threshold = 0.87).
This particular model is from a checkpoint captured at step 175,500, corresponding to an additional 134,217,728,000 tokens of training.
Note: sequence-length warmup was not used when increasing the context length from 2048 to 4096; in hindsight, it should have been applied.
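
Below is a minimal usage sketch with the `transformers` library, assuming a standard causal-LM checkpoint layout; the repo id is a hypothetical placeholder, so substitute this card's actual repository path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id — replace with this model's actual Hugging Face path.
model_id = "pythia-2.8b-deduped-4k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The fine-tuned context window is 4096 tokens, double the base model's 2048.
prompt = "The Pile is a large, diverse corpus of"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```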
## Acknowledgements
This work would not have been possible without the support of [Stability AI](https://stability.ai/).