Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ thumbnail: https://gsarti.com/publication/it5/featured.png
The [IT5](https://huggingface.co/models?search=it5) model family represents the first effort in pretraining large-scale sequence-to-sequence transformer models for the Italian language, following the approach adopted by the original [T5 model](https://github.com/google-research/text-to-text-transfer-transformer).

-This model is released as part of the project ["IT5:
+This model is released as part of the project ["IT5: Text-to-Text Pretraining for Italian Language Understanding and Generation"](https://aclanthology.org/2024.lrec-main.823/), by [Gabriele Sarti](https://gsarti.com/) and [Malvina Nissim](https://malvinanissim.github.io/) with the support of [Huggingface](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104) and with TPU usage sponsored by Google's [TPU Research Cloud](https://sites.research.google/trc/). All the training was conducted on a single TPU3v8-VM machine on Google Cloud. Refer to the Tensorboard tab of the repository for an overview of the training process.

*The inference widget is deactivated because the model needs task-specific seq2seq fine-tuning on a downstream task to be useful in practice. The models in the [`it5`](https://huggingface.co/it5) organization provide some examples of this model fine-tuned on various downstream tasks.*
@@ -77,12 +77,22 @@ For problems or updates on this model, please contact [[email protected]
## Citation Information
```bibtex
-@
-title={
-author=
-
-
-
-
+@inproceedings{sarti-nissim-2024-it5-text,
+    title = "{IT}5: Text-to-text Pretraining for {I}talian Language Understanding and Generation",
+    author = "Sarti, Gabriele and
+      Nissim, Malvina",
+    editor = "Calzolari, Nicoletta and
+      Kan, Min-Yen and
+      Hoste, Veronique and
+      Lenci, Alessandro and
+      Sakti, Sakriani and
+      Xue, Nianwen",
+    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
+    month = may,
+    year = "2024",
+    address = "Torino, Italia",
+    publisher = "ELRA and ICCL",
+    url = "https://aclanthology.org/2024.lrec-main.823",
+    pages = "9422--9433",
}
```
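
As the note above about the inactive inference widget says, the pretrained checkpoint only becomes useful after task-specific seq2seq fine-tuning. Below is a minimal, illustrative sketch of how one of the fine-tuned checkpoints from the [`it5`](https://huggingface.co/it5) organization could be loaded with 🤗 Transformers; the checkpoint name and the input text are assumptions for illustration, not taken from this card.

```python
# Illustrative sketch only: the checkpoint below is an assumed example of an IT5
# model fine-tuned on a downstream task; substitute any checkpoint from the `it5`
# organization that matches your task.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "it5/it5-base-news-summarization"  # assumed example checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encode an Italian input and generate the task-specific output.
text = "Il modello IT5 è stato addestrato su un ampio corpus italiano."  # placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```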