nthngdy/headless-pythia-owt2-70m-ft


nthngdy committed on Sep 26, 2023

Commit aef9bec • 1 Parent(s): 0c49c1f

Create README.md

Files changed (1)

README.md +41 -0

README.md ADDED

	@@ -0,0 +1,41 @@

+---
+license: mit
+datasets:
+- the_pile_openwebtext2
+language:
+- en
+pipeline_tag: text-generation
+---
+### Model Sources
+<!-- Provide the basic links for the model. -->
+- **Repository:** https://github.com/NathanGodey/headless-lm
+- **Paper:** https://arxiv.org/abs/2309.08351
+### Model Architecture and Objective
+This model uses the Pythia-70m architecture. It was trained on OpenWebText2 with the Contrastive Weight Tying objective, then briefly fine-tuned for language generation on the same dataset.
+## Citation
+**BibTeX:**
+```bibtex
+@misc{godey2023headless,
+      title={Headless Language Models: Learning without Predicting with Contrastive Weight Tying},
+      author={Nathan Godey and Éric de la Clergerie and Benoît Sagot},
+      year={2023},
+      eprint={2309.08351},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
+## Contact
+[email protected]
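
Note on the objective named in the README: as a rough illustration, Contrastive Weight Tying can be read as an in-batch contrastive loss in which each output vector is scored against the input embeddings of the target tokens in the batch, and should score highest against its own target. The sketch below is a simplified pure-Python version written under that assumption. The function name `cwt_loss`, the toy embedding table, and the handling of repeated tokens are hypothetical and are not taken from the linked repository, which remains the reference implementation.

```python
import math


def cwt_loss(outputs, target_ids, embeddings):
    """Simplified in-batch contrastive loss (illustrative assumption only).

    outputs:    list of output vectors, one per position
    target_ids: list of target token ids, aligned with outputs
    embeddings: dict mapping token id -> input embedding vector
    """
    # Candidates are the input embeddings of every target token in the batch.
    # Simplification: repeated target tokens are not deduplicated here.
    candidates = [embeddings[t] for t in target_ids]
    total = 0.0
    for i, out in enumerate(outputs):
        # Dot-product score between this output and each candidate embedding.
        scores = [sum(o * c for o, c in zip(out, cand)) for cand in candidates]
        # Numerically stable log-sum-exp over all candidates.
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        # Cross-entropy with the position's own target as the positive.
        total += log_z - scores[i]
    return total / len(outputs)


# Toy example: two tokens with orthogonal embeddings.
toy_embeddings = {0: [1.0, 0.0], 1: [0.0, 1.0]}
aligned = cwt_loss([[5.0, 0.0], [0.0, 5.0]], [0, 1], toy_embeddings)
misaligned = cwt_loss([[0.0, 5.0], [5.0, 0.0]], [0, 1], toy_embeddings)
print(aligned < misaligned)
```

In this toy setup, outputs that point toward their own target embedding give a lower loss than outputs that point toward the other token's embedding, which is the behavior a contrastive objective of this kind is meant to reward.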