Update README.md
README.md
CHANGED
@@ -7,7 +7,7 @@ Zamba-7B-v1 is a hybrid model between Mamba, a state-space model, and transforme
 
 Note: the current Huggingface implementation of Zamba performs slower than our internal implementation. We are working to fix this with the Huggingface team.
 
-Our technical report describing the training of Zamba is available [here](https://arxiv.org/abs/2405.16712)
+Our technical report describing the training of Zamba is available [here](https://arxiv.org/abs/2405.16712).
 
 ## Quick start
 
@@ -49,7 +49,8 @@ print(tokenizer.decode(outputs[0]))
 
 If you find Zamba useful in your work please cite it as:
 
-
+```
+@article{glorioso2024zamba,
 title={Zamba: A Compact 7B SSM Hybrid Model},
 author={Glorioso, Paolo and Anthony, Quentin and Tokpanov, Yury and Whittington, James and Pilault, Jonathan and Ibrahim, Adam and Millidge, Beren},
 journal={arXiv preprint arXiv:2405.16712},