Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ thumbnail: https://gsarti.com/publication/it5/featured.png
The [IT5](https://huggingface.co/models?search=it5) model family represents the first effort in pretraining large-scale sequence-to-sequence transformer models for the Italian language, following the approach adopted by the original [T5 model](https://github.com/google-research/text-to-text-transfer-transformer).

-This model is released as part of the project ["IT5:
+This model is released as part of the project ["IT5: Text-to-Text Pretraining for Italian Language Understanding and Generation"](https://aclanthology.org/2024.lrec-main.823/), by [Gabriele Sarti](https://gsarti.com/) and [Malvina Nissim](https://malvinanissim.github.io/) with the support of [Huggingface](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/7104) and with TPU usage sponsored by Google's [TPU Research Cloud](https://sites.research.google/trc/). All the training was conducted on a single TPU3v8-VM machine on Google Cloud. Refer to the Tensorboard tab of the repository for an overview of the training process.

*The inference widget is deactivated because the model needs task-specific seq2seq fine-tuning on a downstream task to be useful in practice. The models in the [`it5`](https://huggingface.co/it5) organization provide some examples of this model fine-tuned on various downstream tasks.*
@@ -77,12 +77,22 @@ For problems or updates on this model, please contact [[email protected]
## Citation Information
```bibtex
-@
-title={
-author=
-
-
-
-
+@inproceedings{sarti-nissim-2024-it5-text,
+    title = "{IT}5: Text-to-text Pretraining for {I}talian Language Understanding and Generation",
+    author = "Sarti, Gabriele and
+      Nissim, Malvina",
+    editor = "Calzolari, Nicoletta and
+      Kan, Min-Yen and
+      Hoste, Veronique and
+      Lenci, Alessandro and
+      Sakti, Sakriani and
+      Xue, Nianwen",
+    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
+    month = may,
+    year = "2024",
+    address = "Torino, Italia",
+    publisher = "ELRA and ICCL",
+    url = "https://aclanthology.org/2024.lrec-main.823",
+    pages = "9422--9433",
}
```
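
As the note above about the inactive inference widget says, the pretrained checkpoint only becomes useful after task-specific seq2seq fine-tuning. Below is a minimal, illustrative sketch of how one of the fine-tuned checkpoints from the [`it5`](https://huggingface.co/it5) organization could be loaded with 🤗 Transformers; the checkpoint name and the input text are assumptions for illustration, not taken from this card.

```python
# Illustrative sketch only: the checkpoint below is an assumed example of an IT5
# model fine-tuned on a downstream task; substitute any checkpoint from the `it5`
# organization that matches your task.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "it5/it5-base-news-summarization"  # assumed example checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encode an Italian input and generate the task-specific output.
text = "Il modello IT5 è stato addestrato su un ampio corpus italiano."  # placeholder input
inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```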