BramVanroy committed
Commit cf256fa • Parent(s): c5874d9
improve README
README.md
CHANGED
@@ -72,18 +72,24 @@ dataset.
 The model was created in light of the master thesis of Charlotte Van de Velde in the Master of Science in Artificial
 Intelligence (MAI) at KU Leuven in 2023. Dataset creation by Charlotte, model training by Bram.
 
-
-
-
-
-
-
-More information needed
+## Quick links
+
+- [Repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep): includes training code and model creation log
+- [Dataset](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification): `BramVanroy/chatgpt-dutch-simplification`
+- [Parent model](https://huggingface.co/yhavinga/ul2-small-dutch): this model was finetuned on `yhavinga/ul2-small-dutch`
+
+## Intended uses & limitations, and dataset
+
+The model is intended for sentence-level simplification of Dutch. It might extend to document-level simplification,
+but most of the dataset is limited to sentences, so document-level performance is not guaranteed.
+
+The dataset has been generated automatically (cf. the
+[dataset description](https://huggingface.co/datasets/BramVanroy/chatgpt-dutch-simplification)) and has not been
+manually verified. On top of that, this model has been fine-tuned and we did not scrutinize the parent model or its
+training data. Output of the current model is therefore subject to unexpected results (as are most if not all neural
+networks).
+
+Because the dataset was generated with ChatGPT, this model cannot be used for commercial purposes.
 
 ## Training procedure
 
@@ -93,9 +99,10 @@ The following hyperparameters were used during training:
 - learning_rate: 0.0006370158604635734
 - train_batch_size: 20
 - optimizer: Adafactor
-- lr_scheduler_type: linear
 - num_epochs: 37
 
+These hyperparameters were found through a Bayesian hyperparameter search with `wandb`, as described in the
+[repository](https://github.com/BramVanroy/mai-simplification-nl-2023#22-hyperparameter-sweep).
 
 ### Training results
 
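For readers unfamiliar with `wandb` sweeps: a Bayesian search like the one referenced in the second hunk is typically driven by a sweep configuration file. The sketch below is illustrative only — the metric name, parameter names, and ranges are assumptions, not the settings actually used; the real sweep setup is in the linked repository.

```yaml
# Hypothetical wandb sweep config, for illustration only.
# The actual configuration is in the BramVanroy/mai-simplification-nl-2023 repository.
method: bayes              # Bayesian optimization over the search space
metric:
  name: eval/loss          # assumed objective; minimize validation loss
  goal: minimize
parameters:
  learning_rate:
    distribution: log_uniform_values
    min: 1.0e-5
    max: 1.0e-3
  train_batch_size:
    values: [8, 12, 16, 20, 24]
  num_train_epochs:
    distribution: int_uniform
    min: 10
    max: 40
```

Such a config would be registered with `wandb sweep <file>` and run with `wandb agent`; each agent run samples a hyperparameter combination (e.g. the winning `learning_rate: 0.0006370158604635734`, `train_batch_size: 20`, `num_epochs: 37` listed above) and reports the metric back to the Bayesian optimizer.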