---
language:
- nl
license: cc-by-nc-sa-4.0
tags:
- generated_from_trainer
- simplification
datasets:
- BramVanroy/chatgpt-dutch-simplification
metrics:
- rouge
- sari
task_categories:
- text2text-generation
task_ids:
- text-simplification
widget:
- example_title: Cooking
  text: Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties welke
    door de ambachtelijke expertise van mijn grootmoeder zijn vervaardigd.
base_model: yhavinga/ul2-large-dutch
model-index:
- name: BramVanroy/ul2-large-dutch-simplification-mai-2023
  results:
  - task:
      type: text-simplification
      name: Text Simplification
    dataset:
      name: ChatGPT Dutch Simplification
      type: BramVanroy/chatgpt-dutch-simplification
    metrics:
    - type: rouge
      value: 41.3871
      name: Eval Rouge-1
    - type: rouge
      value: 19.6751
      name: Eval Rouge-2
    - type: rouge
      value: 36.0469
      name: Eval RougeL
    - type: rouge
      value: 36.1178
      name: Eval RougeLsum
    - type: sari
      value: 54.3588
      name: Eval SARI
    - type: rouge
      value: 43.8191
      name: Test Rouge-1
    - type: rouge
      value: 21.7783
      name: Test Rouge-2
    - type: rouge
      value: 39.3657
      name: Test RougeL
    - type: rouge
      value: 39.3751
      name: Test RougeLsum
    - type: sari
      value: 52.3752
      name: Test SARI
---

# ul2-large-dutch-simplification-mai-2023

This model is intended to simplify Dutch sentences.

This model is a fine-tuned version of [yhavinga/ul2-large-dutch](https://huggingface.co/yhavinga/ul2-large-dutch) on
the [BramVanroy/chatgpt-dutch-simplification](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)
dataset.

The model was created as part of the master's thesis of Charlotte Van de Velde in the Master of Science in Artificial
Intelligence (MAI) at KU Leuven in 2023. Charlotte is supervised by Vincent Vandeghinste and Bram Vanroy.
The dataset was created by Charlotte and the model was trained by Bram.

## Quick links

- [Repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep): includes the training code and the model creation log
- [Dataset](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification): `BramVanroy/chatgpt-dutch-simplification`
- [Parent model](https://huggingface.co/yhavinga/ul2-large-dutch): this model was fine-tuned on `yhavinga/ul2-large-dutch`
- [Demo](https://huggingface.co/spaces/BramVanroy/mai-simplification-nl-2023-demo): shows the "base" model in action (do not rely on the "Hosted inference API" widget on this page, as it does not work very well)

## Intended uses & limitations, and dataset

The model is intended for sentence-level simplification of Dutch. It might extend to document-level simplification,
but most of the dataset consists of single sentences, so document-level performance is not guaranteed.
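
As a minimal usage sketch (not part of the original training or evaluation code), the model can be loaded with the
standard `transformers` text2text-generation pipeline. The generation settings below are illustrative; if the training
setup used a task prefix or other preprocessing, that is not reflected here (see the repository for details).

```python
from transformers import pipeline

# Illustrative sketch: the model is a UL2/T5-style seq2seq model, so it can be
# served through the text2text-generation pipeline.
simplifier = pipeline(
    "text2text-generation",
    model="BramVanroy/ul2-large-dutch-simplification-mai-2023",
)

text = (
    "Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties "
    "welke door de ambachtelijke expertise van mijn grootmoeder zijn vervaardigd."
)

# num_beams=3 mirrors the beam search setting reported under "Training results";
# max_new_tokens is an arbitrary illustrative limit.
result = simplifier(text, num_beams=3, max_new_tokens=64)
print(result[0]["generated_text"])
```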
The dataset has been generated automatically (cf. the
[dataset description](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)) and has not been
manually verified. On top of that, this model has been fine-tuned and we did not scrutinize the parent model or its
training data. Output of the current model may therefore contain unexpected results (as is the case for most, if not
all, neural networks).

Because the dataset was generated with ChatGPT, this model cannot be used for commercial purposes.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an illustrative sketch of how they could be passed to
`Seq2SeqTrainingArguments` follows the list):

- learning_rate: 0.0002927210895006501
- train_batch_size: 32
- optimizer: Adafactor
- num_epochs: 27
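
For illustration only, the values above could be plugged into a Hugging Face `Seq2SeqTrainingArguments` configuration
roughly as follows. The output directory is a placeholder, and mapping `train_batch_size` to
`per_device_train_batch_size` is an assumption; the actual training script is in the repository linked above.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: maps the hyperparameters listed above onto training arguments.
# optim="adafactor" selects the Adafactor optimizer.
training_args = Seq2SeqTrainingArguments(
    output_dir="ul2-large-dutch-simplification-mai-2023",  # placeholder
    learning_rate=0.0002927210895006501,
    per_device_train_batch_size=32,  # assumed to correspond to train_batch_size above
    optim="adafactor",
    num_train_epochs=27,
    predict_with_generate=True,  # generate during evaluation so ROUGE/SARI can be computed
)
```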
These hyperparameters were found through a Bayesian hyperparameter search with `wandb`. This is described in the
[repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep).
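
To illustrate what such a Bayesian search can look like, the sketch below uses the `wandb` sweep API. The swept
parameters, metric name, project name, and run count are assumptions for illustration, not the exact sweep that was
used; the learning rate bounds follow the 1e-04 to 1e-03 range mentioned in the note further down.

```python
import wandb


def train_one_run():
    # Hypothetical training function: it would read wandb.config, train the model
    # with those hyperparameters, and log the evaluation metrics back to wandb.
    ...


sweep_config = {
    "method": "bayes",  # Bayesian optimisation
    "metric": {"name": "eval/sari", "goal": "maximize"},  # assumed metric name
    "parameters": {
        "learning_rate": {"distribution": "log_uniform_values", "min": 1e-4, "max": 1e-3},
        "num_train_epochs": {"min": 5, "max": 30},  # illustrative bounds
    },
}

sweep_id = wandb.sweep(sweep_config, project="mai-simplification-nl-2023")  # project name assumed
wandb.agent(sweep_id, function=train_one_run, count=20)  # run count is illustrative
```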
### Training results

`eval` results are on the evaluation set, `predict` results are on the test set. These were achieved with
beam search (num_beams=3).

```json
{
    "eval_gen_len": 21.404761904761905,
    "eval_loss": 3.0882697105407715,
    "eval_rouge1": 41.3871,
    "eval_rouge2": 19.6751,
    "eval_rougeL": 36.0469,
    "eval_rougeLsum": 36.1178,
    "eval_sari": 54.3588,

    "predict_gen_len": 22.1484375,
    "predict_loss": 2.7822625637054443,
    "predict_rouge1": 43.8191,
    "predict_rouge2": 21.7783,
    "predict_rougeL": 39.3657,
    "predict_rougeLsum": 39.3751,
    "predict_sari": 52.3752
}
```
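
The exact evaluation script is not reproduced here, but as a rough sketch the same metric types can be computed with
the `evaluate` library. SARI additionally needs the source (complex) sentences; the sentences below are toy examples,
not items from the dataset. Note that `evaluate` returns ROUGE scores in the 0-1 range, whereas the values above
appear to be scaled to 0-100; SARI is already on a 0-100 scale.

```python
import evaluate

rouge = evaluate.load("rouge")
sari = evaluate.load("sari")

# Toy example: sources are the complex inputs, predictions the model outputs,
# and references the gold simplifications (one list of references per example).
sources = ["Op bepaalde tijdstippen verlang ik naar de smaakvolle culinaire creaties van mijn grootmoeder."]
predictions = ["Soms wil ik de lekkere gerechten van mijn oma."]
references = [["Soms verlang ik naar de lekkere gerechten van mijn oma."]]

print(rouge.compute(predictions=predictions, references=references))
print(sari.compute(sources=sources, predictions=predictions, references=references))
```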
Note: the model seems to underperform compared to the
[base variant](https://huggingface.co/BramVanroy/ul2-small-dutch-simplification-mai-2023) of the model, achieving only
similar results despite its much larger size. The reason may lie in the hyperparameters: this larger model may have
benefited from a smaller learning rate, but in the hyperparameter search the learning rate range was set to 1e-04 to
1e-03, which may be too high for a model of this size.

### Framework versions

- Transformers 4.29.2
- Pytorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3