Inclusive Language Rewriting Model

This is an Italian sequence-to-sequence model fine-tuned from IT5-large for the task of inclusive language rewriting.

It has been trained to analyze Italian sentences and rewrite them to be more inclusive where needed.

For example, the sentence *I professori devono essere preparati* (The professors must be prepared) is rewritten as *Il personale docente deve essere preparato* (The teaching staff must be prepared).
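
As a minimal sketch of how such a checkpoint can be queried with the Hugging Face transformers library (the repository id and the generation settings below are placeholders, not this model's actual values):

```python
# Minimal inference sketch with transformers; the repository id is a
# placeholder for this model's actual Hub id.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "your-org/it5-inclusive-rewriting"  # hypothetical id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("I professori devono essere preparati", return_tensors="pt")
outputs = model.generate(**inputs, max_length=128, num_beams=4)  # illustrative beam settings
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Expected output: "Il personale docente deve essere preparato"
```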

Training data

The model has been trained on a dataset of 4705 sentence pairs, each consisting of a non-inclusive sentence and its inclusive rewriting. The dataset has been split as follows (an illustrative sketch of the pair format follows the list):

  • Training set: 3764 pairs
  • Validation set: 470 pairs
  • Test set: 471 pairs
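
Since the dataset is not publicly available, the sketch below only illustrates a plausible pair format and the split sizes above; the field names and the example pair are hypothetical.

```python
# Illustrative pair format and split; field names and the example pair
# are hypothetical, since the dataset is not publicly available.
import random

pairs = [
    {"non_inclusive": "I professori devono essere preparati",
     "inclusive": "Il personale docente deve essere preparato"},
    # ... 4704 more annotated pairs
]

random.seed(42)           # arbitrary seed, for reproducibility
random.shuffle(pairs)
train = pairs[:3764]      # training set (3764 pairs)
val = pairs[3764:4234]    # validation set (470 pairs)
test = pairs[4234:]       # test set (471 pairs)
```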

We also leverage a small set of 75 synthetic pairs (generated using a set of rules) to improve the model's performance on the test set. Training is therefore performed on a total of 3764 + 75 = 3839 pairs.

The dataset has been manually annotated by experts in the field of inclusive language (it is not publicly available yet).

Training procedure

The model has been fine-tuned from IT5-large using the following hyperparameters:

  • max_length: 128
  • batch_size: 8
  • learning_rate: 5e-5
  • warmup_steps: 500
  • epochs: 25 (best model is selected based on validation BLEU score)
  • optimizer: AdamW
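
Below is a sketch of this fine-tuning setup, assuming the gsarti/it5-large checkpoint and the transformers Seq2SeqTrainer (AdamW is the trainer's default optimizer). The one-pair dataset stands in for the 3839 real training pairs, and the per-epoch BLEU-based model selection is omitted for brevity.

```python
# Fine-tuning sketch with the hyperparameters listed above; the dataset
# is a one-pair stand-in, and validation-BLEU model selection is omitted.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq,
                          Seq2SeqTrainingArguments, Seq2SeqTrainer)

base = "gsarti/it5-large"  # assumed IT5-large checkpoint on the Hub
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSeq2SeqLM.from_pretrained(base)

raw = Dataset.from_dict({
    "source": ["I professori devono essere preparati"],
    "target": ["Il personale docente deve essere preparato"],
})

def preprocess(batch):
    # Tokenize source and target to at most max_length=128 tokens.
    enc = tokenizer(batch["source"], max_length=128, truncation=True)
    enc["labels"] = tokenizer(text_target=batch["target"],
                              max_length=128, truncation=True)["input_ids"]
    return enc

train_ds = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="it5-inclusive-rewriting",
    per_device_train_batch_size=8,
    learning_rate=5e-5,
    warmup_steps=500,
    num_train_epochs=25,
    predict_with_generate=True,  # needed to compute BLEU on generated text
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```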

Evaluation results

The model has been evaluated on the test set and obtained the following results:

| Model                 | BLEU  | ROUGE-2 F1 | Human Correct | Human Partial (L) | Human Incorrect (L) |
|-----------------------|-------|------------|---------------|-------------------|---------------------|
| IT5 (no synth. data)  | 80.32 | 87.17      | 64.76         | 15.71             | 19.52               |
| This model            | 80.79 | 87.47      | 69.52         | 17.14             | 13.22               |

Metrics marked with (L) are "lower is better". The comparison with the same model trained without synthetic data shows that the synthetic data improves the model's performance on the test set. Other comparisons can be found in the paper.
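
As a rough sketch, the automatic metrics can be computed with the `evaluate` library (sacrebleu for BLEU, rouge for ROUGE-2 F1); the exact metric implementation used in the paper may differ, and the single prediction/reference pair here is illustrative.

```python
# Metric sketch with the `evaluate` library; the paper's exact
# metric setup may differ from these defaults.
import evaluate

predictions = ["Il personale docente deve essere preparato"]
references = ["Il personale docente deve essere preparato"]

bleu = evaluate.load("sacrebleu")
rouge = evaluate.load("rouge")

# sacrebleu expects one list of references per prediction.
print(bleu.compute(predictions=predictions,
                   references=[[r] for r in references])["score"])
# evaluate's rouge reports F-measure, matching ROUGE-2 F1 above.
print(rouge.compute(predictions=predictions,
                    references=references)["rouge2"])
```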

Citation

If you use this model, please make sure to cite the following papers:

Demo paper:


Main paper:

