---
language: nl
widget:
- text: In het jaar 2030 zullen we
- text: Toen ik gisteren volledig in de ban was van
- text: >-
    Studenten en leraren van de Bogazici Universiteit in de Turkse stad
    Istanbul
- text: In Israël was een strenge lockdown
tags:
- gpt-neo-1.3B
- gpt-neo
pipeline_tag: text-generation
datasets:
- yhavinga/mc4_nl_cleaned
---

# GPT Neo 1.3B pre-trained on cleaned Dutch mC4 🇳🇱
NB: Training in progress.
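The model can be used for Dutch text generation with the Transformers pipeline. A minimal sketch; the repo id `yhavinga/gpt-neo-1.3B-dutch` is an assumption here and should be replaced by the actual id of this repository:

```python
from transformers import pipeline

# Repo id is an assumption; substitute the actual id of this model repository.
generator = pipeline("text-generation", model="yhavinga/gpt-neo-1.3B-dutch")

# One of the widget prompts from the model card metadata
output = generator(
    "In het jaar 2030 zullen we",
    max_length=50,
    do_sample=True,
    top_k=50,
)
print(output[0]["generated_text"])
```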
Dataset:

- mC4 NL Cleaned (yhavinga/mc4_nl_cleaned)
  - dataset config: tiny (3B tokens)
  - dataset config: full (33B tokens)
Tokenizer:

- Tokenizer trained on mC4 with scripts from the Hugging Face Transformers Flax examples (see the sketch below)
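A minimal sketch of such a tokenizer training run, assuming a byte-level BPE tokenizer as in the Flax causal-LM examples; the `tiny` config, the `text` column name, and the vocabulary size of 50,257 are assumptions here:

```python
from datasets import load_dataset
from tokenizers import ByteLevelBPETokenizer

# Load the cleaned Dutch mC4 corpus (tiny config shown as an example)
dataset = load_dataset("yhavinga/mc4_nl_cleaned", "tiny", split="train")

def batch_iterator(batch_size=1000):
    # Yield batches of raw text for tokenizer training
    for i in range(0, len(dataset), batch_size):
        yield dataset[i : i + batch_size]["text"]

# Train a byte-level BPE tokenizer, as in the Transformers Flax examples
tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(
    batch_iterator(),
    vocab_size=50257,                 # assumed GPT-Neo-style vocab size
    min_frequency=2,
    special_tokens=["<|endoftext|>"],
)
tokenizer.save("tokenizer.json")
```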
Training details:

- Trained for 70K steps (batch size 64) to a perplexity of 27 on one epoch of the mC4 NL tiny config
- Trained for 760K steps (batch size 16) to a perplexity of 16.8 on the mC4 NL full config
- Training is still in progress
- Block size: 512
- Optimizer: Adafactor (see the schedule sketch after this list)
- Learning rate: 5e-5
- Warmup steps: 5000
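A sketch of the optimizer setup under these hyperparameters, assuming the linear warmup/decay schedule used in the Transformers Flax causal-LM example; the total step count and the decay shape are not stated above and are assumptions:

```python
import optax

# Hyperparameters from the training details above
peak_lr = 5e-5
warmup_steps = 5_000
total_steps = 760_000  # assumed; training is still in progress

# Linear warmup to the peak learning rate, then linear decay to zero
warmup_fn = optax.linear_schedule(
    init_value=0.0, end_value=peak_lr, transition_steps=warmup_steps
)
decay_fn = optax.linear_schedule(
    init_value=peak_lr, end_value=0.0, transition_steps=total_steps - warmup_steps
)
schedule_fn = optax.join_schedules(
    schedules=[warmup_fn, decay_fn], boundaries=[warmup_steps]
)

# Adafactor optimizer driven by the warmup/decay schedule
optimizer = optax.adafactor(learning_rate=schedule_fn)
```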
Work in progress, January 2022.
- Many thanks to the Google TPU Research Cloud for providing access to a TPU cluster!
- Thanks to @gsarti for creating the t5-flax-gcp repository.
- Also thanks to the creators of gpt2-medium-persian and gpt2-medium-indonesian for sharing their training scripts!