---
inference: false
license: mit
tags:
- text-generation
- mamba
- long context
---

# DeciMamba Checkpoint (Baseline)

The official checkpoint of Mamba-130m, fine-tuned for language modeling on the PG-19 dataset, as presented in [DeciMamba: Exploring the Length Extrapolation Potential of Mamba](https://arxiv.org/abs/2406.14528).

See our [GitHub repo](https://github.com/assafbk/DeciMamba) for evaluation and training scripts.

BibTeX:
```
@misc{benkish2024decimambaexploringlengthextrapolation,
      title={DeciMamba: Exploring the Length Extrapolation Potential of Mamba},
      author={Assaf Ben-Kish and Itamar Zimerman and Shady Abu-Hussein and Nadav Cohen and Amir Globerson and Lior Wolf and Raja Giryes},
      year={2024},
      eprint={2406.14528},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2406.14528},
}
```