BramVanroy committed
Commit cf256fa
1 Parent(s): c5874d9

improve README

Files changed (1):
  1. README.md +15 -8

README.md CHANGED
@@ -72,18 +72,24 @@ dataset.
  The model was created as part of the master's thesis of Charlotte Van de Velde in the Master of Science in Artificial
  Intelligence (MAI) at KU Leuven in 2023. Dataset creation by Charlotte, model training by Bram.
 
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
+ ## Quick links
+
+ - [Repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep): includes the training code and the model creation log
+ - [Dataset](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification): `BramVanroy/chatgpt-dutch-simplification`
+ - [Parent model](https://huggingface.co/yhavinga/ul2-small-dutch): this model was fine-tuned from `yhavinga/ul2-small-dutch`
+
+ ## Intended uses & limitations, and dataset
+
+ The model is intended for sentence-level simplification of Dutch. It may extend to document-level simplification,
+ but most of the dataset consists of single sentences, so document-level performance is not guaranteed.
+
+ The dataset was generated automatically (cf. the
+ [dataset description](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)) and has not been
+ manually verified. On top of that, this model is a fine-tune, and we did not scrutinize the parent model or its
+ training data. The model's output may therefore contain unexpected results (as is true of most, if not all, neural
+ networks).
+
+ Because the dataset was generated with ChatGPT, this model cannot be used for commercial purposes.
 
  ## Training procedure
 
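The intended use described in the hunk above lends itself to a short demonstration. Below is a minimal inference sketch; the repo id and the plain-sentence input format are assumptions (neither is stated in this commit), so check the linked repository for the exact inference setup.

```python
# Minimal inference sketch. ASSUMPTIONS: the repo id below and plain Dutch
# sentences as input are not confirmed by this commit; see the linked
# repository for the actual setup.
from transformers import pipeline

simplifier = pipeline(
    "text2text-generation",
    model="BramVanroy/ul2-small-dutch-simplification-mai-2023",  # assumed repo id
)

sentence = "De gemeenteraad heeft het voorstel na ampele beraadslaging geratificeerd."
# The pipeline returns a list with one dict holding the simplified text.
print(simplifier(sentence, max_new_tokens=64)[0]["generated_text"])
```

Since the dataset is sentence-level, passing one sentence at a time stays closest to the training distribution.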
@@ -93,9 +99,10 @@ The following hyperparameters were used during training:
  - learning_rate: 0.0006370158604635734
  - train_batch_size: 20
  - optimizer: Adafactor
- - lr_scheduler_type: linear
  - num_epochs: 37
 
+ These hyperparameters were found through a Bayesian hyperparameter search with `wandb`, as described in the
+ [repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep).
 
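For orientation, the hyperparameters listed above could map onto a `transformers` training setup roughly as follows. The `output_dir` is a placeholder, and the constant schedule is a guess (the commit only removes the `lr_scheduler_type: linear` entry from the card); the authoritative training code is in the linked repository.

```python
# A sketch of the listed hyperparameters as Seq2SeqTrainingArguments.
# ASSUMPTIONS: output_dir is a placeholder; the constant schedule is a guess,
# since the commit only removes the "linear" entry from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ul2-small-dutch-simplification",  # placeholder
    learning_rate=0.0006370158604635734,
    per_device_train_batch_size=20,   # train_batch_size: 20, as listed above
    optim="adafactor",                # Adafactor, as listed above
    lr_scheduler_type="constant",     # assumption, see note above
    num_train_epochs=37,
)
```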
 
  ### Training results
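The hyperparameter note above mentions a Bayesian search with `wandb`. Below is a hedged sketch of what such a sweep could look like; the metric, project name, and search ranges are illustrative placeholders, and the real sweep configuration is in the linked repository.

```python
# Illustrative Bayesian sweep with wandb. ASSUMPTIONS: the metric, the
# project name, and the search ranges are placeholders, not the real sweep.
import wandb

sweep_config = {
    "method": "bayes",                                    # Bayesian search
    "metric": {"name": "eval/loss", "goal": "minimize"},  # assumed metric
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 1e-3},      # illustrative range
        "num_train_epochs": {"min": 10, "max": 40},       # illustrative range
    },
}

sweep_id = wandb.sweep(sweep_config, project="mai-simplification-nl-2023")  # assumed project
# wandb.agent(sweep_id, function=train, count=20)  # train() = training entry point
```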
108