Update README.md
README.md
CHANGED
@@ -52,6 +52,8 @@ Fietje was continue-pretrained on 28B Dutch tokens, which includes the full Dutc
 
 I am thankful to the [Flemish Supercomputer Center](https://www.vscentrum.be/) (VSC) for providing the computational power to accomplish this project. Accounting for waiting for jobs, training took around two weeks on four nodes of 4x A100 80GB each (16 total).
 
+Training was done with the wonderful [alignment-handbook](https://github.com/huggingface/alignment-handbook), using DeepSpeed as a back-end. Exact training recipes and SLURM script are given in the [Github repository](https://github.com/BramVanroy/fietje).
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training: