metadata

license: cc-by-nc-sa-4.0
tags:
  - generated_from_trainer
  - simplification
task_categories:
  - text2text-generation
task_ids:
  - text-simplification
language:
  - nl
datasets:
  - BramVanroy/chatgpt-dutch-simplification
metrics:
  - rouge
  - sari
model-index:
  - name: BramVanroy/ul2-large-dutch-simplification-mai-2023
    results:
      - task:
          type: text-simplification
          name: Text Simplification
        dataset:
          type: BramVanroy/chatgpt-dutch-simplification
          name: ChatGPT Dutch Simplification
        metrics:
          - type: rouge
            value: 41.3871
            name: Eval Rouge-1
          - type: rouge
            value: 19.6751
            name: Eval Rouge-2
          - type: rouge
            value: 36.0469
            name: Eval RougeL
          - type: rouge
            value: 36.1178
            name: Eval RougeLsum
          - type: sari
            value: 54.3588
            name: Eval SARI
          - type: rouge
            value: 43.8191
            name: Test Rouge-1
          - type: rouge
            value: 21.7783
            name: Test Rouge-2
          - type: rouge
            value: 39.3657
            name: Test RougeL
          - type: rouge
            value: 39.3751
            name: Test RougeLsum
          - type: sari
            value: 52.3752
            name: Test SARI
widget:
  - example_title: Cooking
    text: >-
      Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties
      welke door de ambachtelijke expertise van mijn grootmoeder zijn
      vervaardigd.

ul2-large-dutch-simplification-mai-2023

This model is intended to simplify Dutch sentences.

This model is a fine-tuned version of yhavinga/ul2-large-dutch on the BramVanroy/chatgpt-dutch-simplification dataset.

The model was created in the context of the master's thesis of Charlotte Van de Velde in the Master of Science in Artificial Intelligence (MAI) at KU Leuven in 2023. Charlotte was supervised by Vincent Vandeghinste and Bram Vanroy. The dataset was created by Charlotte; the model was trained by Bram.

Quick links

Repository: includes training code and model creation log
Dataset: BramVanroy/chatgpt-dutch-simplification
Parent model: this model was fine-tuned from yhavinga/ul2-large-dutch
Demo: shows the "base" model in action (don't rely on the "Hosted inference API" widget on this page; it does not work very well)

Intended uses & limitations, and dataset

The model is intended for sentence-level simplification of Dutch. It might extend to document-level simplification but most of the dataset is limited to sentences so document-level performance is not guaranteed.
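For sentence-level use, a minimal sketch with the Transformers `pipeline` API could look as follows. The helper names `generation_kwargs` and `simplify` and the `max_new_tokens` value are illustrative assumptions, not part of the original training code; `num_beams=3` mirrors the beam search setting reported under Training results. Whether the input needs a task prefix is not stated on this card, so check the repository before relying on this.

```python
# Model identifier as listed in this card's metadata.
MODEL_NAME = "BramVanroy/ul2-large-dutch-simplification-mai-2023"


def generation_kwargs(num_beams: int = 3, max_new_tokens: int = 128) -> dict:
    """Generation settings; max_new_tokens is an assumed, illustrative limit."""
    return {"num_beams": num_beams, "max_new_tokens": max_new_tokens}


def simplify(sentences, model_name: str = MODEL_NAME) -> list:
    """Simplify Dutch sentences one by one (hypothetical helper)."""
    # Imported lazily so that the helpers above work without transformers installed.
    from transformers import pipeline

    simplifier = pipeline("text2text-generation", model=model_name)
    results = []
    for sentence in sentences:
        # For a single string, the pipeline returns a list with one dict.
        output = simplifier(sentence, **generation_kwargs())
        results.append(output[0]["generated_text"])
    return results
```

Calling `simplify(["Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties welke door de ambachtelijke expertise van mijn grootmoeder zijn vervaardigd."])` downloads the model on first use and returns a list with one simplified sentence.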

The dataset was generated automatically (cf. dataset description) and has not been manually verified. In addition, this model is a fine-tuned version of a parent model, and we did not scrutinize that parent model or its training data. The output of this model may therefore be unexpected (as is the case for most, if not all, neural networks).

Because the dataset was generated with ChatGPT, this model cannot be used for commercial purposes.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002927210895006501
train_batch_size: 32
optimizer: Adafactor
num_epochs: 27
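As a rough sketch, the values above might map onto `Seq2SeqTrainingArguments` as shown below. This is an assumption about how they could be passed, not the actual training script (which lives in the repository). In particular, mapping `train_batch_size` to `per_device_train_batch_size` assumes single-device training, and `build_training_arguments` is a hypothetical helper.

```python
# Hyperparameters as reported above, renamed to Transformers argument names.
# Assumption: train_batch_size corresponds to the per-device batch size.
REPORTED_HYPERPARAMETERS = {
    "learning_rate": 0.0002927210895006501,
    "per_device_train_batch_size": 32,
    "optim": "adafactor",
    "num_train_epochs": 27,
}


def build_training_arguments(output_dir: str):
    """Build training arguments from the reported values (hypothetical helper)."""
    # Imported lazily so that the dictionary above is usable without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # generate text during evaluation
        generation_num_beams=3,      # beam search setting reported with the results
        **REPORTED_HYPERPARAMETERS,
    )
```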

These hyperparameters were found through a Bayesian hyperparameter search with wandb. The search is described in the repository.
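A Bayesian sweep of this kind is typically configured in wandb with a dictionary such as the one sketched below. The learning rate bounds follow the range mentioned in the note under Training results; the metric name `eval/sari`, the log-uniform distribution, and the helper `build_sweep_config` are assumptions for illustration. The actual sweep definition, including any other searched parameters, is in the repository.

```python
def build_sweep_config(
    metric_name: str = "eval/sari",  # assumed metric name, not confirmed by the card
    lr_min: float = 1e-4,
    lr_max: float = 1e-3,
) -> dict:
    """Return a wandb sweep configuration for Bayesian search (illustrative)."""
    return {
        "method": "bayes",
        "metric": {"name": metric_name, "goal": "maximize"},
        "parameters": {
            "learning_rate": {
                # Sample the learning rate on a log scale between the bounds.
                "distribution": "log_uniform_values",
                "min": lr_min,
                "max": lr_max,
            },
        },
    }


# With wandb installed and configured, the sweep could be started like this
# (project name and train function are placeholders):
#   import wandb
#   sweep_id = wandb.sweep(build_sweep_config(), project="my-project")
#   wandb.agent(sweep_id, function=train)
```

The learning rate that was finally selected (about 2.9e-04) lies inside these bounds.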

Training results

The eval results are on the evaluation set; the predict results are on the test set. Both were obtained with beam search (num_beams=3).

{
    "eval_gen_len": 21.404761904761905,
    "eval_loss": 3.0882697105407715,
    "eval_rouge1": 41.3871,
    "eval_rouge2": 19.6751,
    "eval_rougeL": 36.0469,
    "eval_rougeLsum": 36.1178,
    "eval_sari": 54.3588,
  
    "predict_gen_len": 22.1484375,
    "predict_loss": 2.7822625637054443,
    "predict_rouge1": 43.8191,
    "predict_rouge2": 21.7783,
    "predict_rougeL": 39.3657,
    "predict_rougeLsum": 39.3751,
    "predict_sari": 52.3752
}

Note: this model seems to underperform relative to its size: it achieves results that are only similar to those of the base variant despite being much larger. The cause may lie in the hyperparameters. In the hyperparameter search, the learning rate range was set to 1e-04 to 1e-03, which might be too high for a model of this size; the large model may have benefitted from smaller learning rates in the search space.

Framework versions

Transformers 4.29.2
Pytorch 2.0.1+cu117
Datasets 2.12.0
Tokenizers 0.13.3