BramVanroy committed
Commit cf256fa
1 Parent(s): c5874d9

improve README

Files changed (1):
  1. README.md +15 -8

README.md CHANGED
@@ -72,18 +72,24 @@ dataset.
  The model was created as part of the master's thesis of Charlotte Van de Velde in the Master of Science in Artificial
  Intelligence (MAI) at KU Leuven in 2023. Dataset creation by Charlotte, model training by Bram.
 
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
+ ## Quick links
+
+ - [Repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep): includes the training code and the model creation log
+ - [Dataset](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification): `BramVanroy/chatgpt-dutch-simplification`
+ - [Parent model](https://huggingface.co/yhavinga/ul2-small-dutch): this model was fine-tuned from `yhavinga/ul2-small-dutch`
+
+ ## Intended uses & limitations, and dataset
+
+ The model is intended for sentence-level simplification of Dutch. It may extend to document-level simplification,
+ but most of the dataset consists of single sentences, so document-level performance is not guaranteed.
+
+ The dataset was generated automatically (cf. the
+ [dataset description](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)) and has not been
+ manually verified. On top of that, this model is a fine-tune, and we did not scrutinize the parent model or its
+ training data. The model's output may therefore contain unexpected results (as is true of most, if not all, neural
+ networks).
+
+ Because the dataset was generated with ChatGPT, this model cannot be used for commercial purposes.
 
  ## Training procedure
 
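The intended use described in the hunk above lends itself to a short demonstration. Below is a minimal inference sketch; the repo id and the plain-sentence input format are assumptions (neither is stated in this commit), so check the linked repository for the exact inference setup.

```python
# Minimal inference sketch. ASSUMPTIONS: the repo id below and plain Dutch
# sentences as input are not confirmed by this commit; see the linked
# repository for the actual setup.
from transformers import pipeline

simplifier = pipeline(
    "text2text-generation",
    model="BramVanroy/ul2-small-dutch-simplification-mai-2023",  # assumed repo id
)

sentence = "De gemeenteraad heeft het voorstel na ampele beraadslaging geratificeerd."
# The pipeline returns a list with one dict holding the simplified text.
print(simplifier(sentence, max_new_tokens=64)[0]["generated_text"])
```

Since the dataset is sentence-level, passing one sentence at a time stays closest to the training distribution.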
@@ -93,9 +99,10 @@ The following hyperparameters were used during training:
  - learning_rate: 0.0006370158604635734
  - train_batch_size: 20
  - optimizer: Adafactor
- - lr_scheduler_type: linear
  - num_epochs: 37
 
+ These hyperparameters were found through a Bayesian hyperparameter search with `wandb`, as described in the
+ [repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep).
 
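For orientation, the hyperparameters listed above could map onto a `transformers` training setup roughly as follows. The `output_dir` is a placeholder, and the constant schedule is a guess (the commit only removes the `lr_scheduler_type: linear` entry from the card); the authoritative training code is in the linked repository.

```python
# A sketch of the listed hyperparameters as Seq2SeqTrainingArguments.
# ASSUMPTIONS: output_dir is a placeholder; the constant schedule is a guess,
# since the commit only removes the "linear" entry from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ul2-small-dutch-simplification",  # placeholder
    learning_rate=0.0006370158604635734,
    per_device_train_batch_size=20,   # train_batch_size: 20, as listed above
    optim="adafactor",                # Adafactor, as listed above
    lr_scheduler_type="constant",     # assumption, see note above
    num_train_epochs=37,
)
```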
 
  ### Training results
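The hyperparameter note above mentions a Bayesian search with `wandb`. Below is a hedged sketch of what such a sweep could look like; the metric, project name, and search ranges are illustrative placeholders, and the real sweep configuration is in the linked repository.

```python
# Illustrative Bayesian sweep with wandb. ASSUMPTIONS: the metric, the
# project name, and the search ranges are placeholders, not the real sweep.
import wandb

sweep_config = {
    "method": "bayes",                                    # Bayesian search
    "metric": {"name": "eval/loss", "goal": "minimize"},  # assumed metric
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 1e-3},      # illustrative range
        "num_train_epochs": {"min": 10, "max": 40},       # illustrative range
    },
}

sweep_id = wandb.sweep(sweep_config, project="mai-simplification-nl-2023")  # assumed project
# wandb.agent(sweep_id, function=train, count=20)  # train() = training entry point
```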
108