metadata
language:
  - it
license: apache-2.0
tags:
  - italian
  - sequence-to-sequence
  - style-transfer
  - formality-style-transfer
datasets:
  - yahoo/xformal_it
widget:
  - text: Questa performance è a dir poco spiacevole.
  - text: >-
      In attesa di un Suo cortese riscontro, Le auguriamo un piacevole
      proseguimento di giornata.
  - text: Questa visione mi procura una goduria indescrivibile.
  - text: qualora ciò possa interessarti, ti pregherei di contattarmi.
metrics:
  - rouge
  - bertscore
model-index:
  - name: mt5-small-formal-to-informal
    results:
      - task:
          type: formality-style-transfer
          name: Formal-to-informal Style Transfer
        dataset:
          type: xformal_it
          name: XFORMAL (Italian Subset)
        metrics:
          - type: rouge1
            value: 0.857
            name: Avg. Test Rouge1
          - type: rouge2
            value: 0.771
            name: Avg. Test Rouge2
          - type: rougeL
            value: 0.854
            name: Avg. Test RougeL
          - type: bertscore
            value: 0.855
            name: Avg. Test BERTScore
            args:
              - model_type: dbmdz/bert-base-italian-xxl-uncased
              - lang: it
              - num_layers: 10
              - rescale_with_baseline: true
              - baseline_path: bertscore_baseline_ita.tsv
co2_eq_emissions:
  emissions: 17g
  source: Google Cloud Platform Carbon Footprint
  training_type: fine-tuning
  geographical_location: Eemshaven, Netherlands, Europe
  hardware_used: 1 TPU v3-8 VM

mT5 Small for Formal-to-informal Style Transfer 🤗

This repository contains the checkpoint for the mT5 Small model fine-tuned for formal-to-informal style transfer on the Italian subset of the XFORMAL dataset, as part of the experiments in the paper IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation by Gabriele Sarti and Malvina Nissim.

A comprehensive overview of other released materials is provided in the gsarti/it5 repository. Refer to the paper for additional details concerning the reported scores and the evaluation approach.

Using the model

Model checkpoints are available for use with TensorFlow, PyTorch and JAX. They can be used directly with pipelines as follows:

from transformers import pipeline

# Build a text2text-generation pipeline backed by this checkpoint
f2i = pipeline("text2text-generation", model="it5/mt5-small-formal-to-informal")
f2i("Vi ringrazio infinitamente per vostra disponibilità")
# >>> [{"generated_text": "e grazie per la vostra disponibilità!"}]

or loaded using the Auto classes:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("it5/mt5-small-formal-to-informal")
model = AutoModelForSeq2SeqLM.from_pretrained("it5/mt5-small-formal-to-informal")
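
Once the tokenizer and model are loaded, generation could look roughly like the sketch below. The helper names `load_model` and `informalize`, the batching choices, and the `max_new_tokens` value are illustrative assumptions, not settings taken from the paper.

```python
from typing import List

MODEL_ID = "it5/mt5-small-formal-to-informal"  # checkpoint name used in this card


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and model (hypothetical helper for this sketch)."""
    # Imported lazily so informalize() can be reused without loading weights.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def informalize(texts: List[str], tokenizer, model,
                max_new_tokens: int = 64) -> List[str]:
    """Rewrite formal Italian sentences in an informal register."""
    # Padding lets sentences of different lengths share one batch;
    # return_tensors="pt" assumes the PyTorch weights are being used.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    # max_new_tokens=64 is an assumed cap, not a value from the paper.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop padding and end-of-sequence tokens from the decoded strings.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `informalize(["Vi ringrazio infinitamente per vostra disponibilità"], *load_model())` returns a list with one rewritten sentence per input. The exact wording depends on the generation settings.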

If you use this model in your research, please cite our work as:

@article{sarti-nissim-2022-it5,
    title={IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation},
    author={Sarti, Gabriele and Nissim, Malvina},
    journal={ArXiv preprint TBD},
    url={TBD},
    year={2022}
}