Galuh
committed
Commit 79140ed • 1 Parent(s): 311c294
Update README.md
README.md CHANGED
@@ -60,17 +60,17 @@ The training data used for this model has not been released as a dataset one can
The model was trained on a combined dataset of [OSCAR](https://oscar-corpus.com/) and [mc4](https://huggingface.co/datasets/mc4) for the Indonesian language, with 29GB of data in total. The mc4 dataset was cleaned using [this script](https://github.com/Wikidepia/indonesian_datasets/blob/master/dump/mc4/cleanup.py) and we also only included links that were cited by IDWiki.

## Training procedure
-The model was trained on a TPUv3-8 VM provided by the Google Cloud team. The training duration was `
+The model was trained on a TPUv3-8 VM provided by the Google Cloud team. The training duration was `6d 3h 7m 26s`.

### Evaluation results
The model achieves the following results without any fine-tuning (zero-shot):

| dataset | train loss | eval loss | eval perplexity |
| ---------- | ---------- | -------------- | ---------- |
-| ID OSCAR+mc4 (29GB) |
+| ID OSCAR+mc4 (29GB) | 2.79 | 2.696 | 14.826 |

### Tracking
-The training process was tracked in [TensorBoard](https://huggingface.co/flax-community/gpt2-
+The training process was tracked in [TensorBoard](https://huggingface.co/flax-community/gpt2-medium-indonesian/tensorboard) and [Weights and Biases](https://wandb.ai/wandb/hf-flax-gpt2-indonesian?workspace=user-cahya).

## Team members
- Akmal ([@Wikidepia](https://huggingface.co/Wikidepia))
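
For anyone reproducing the data setup described in the diff above, here is a minimal sketch of combining the two Indonesian corpora with the `datasets` library. The config names and the column filtering are assumptions; the Wikidepia cleanup script and the IDWiki link filter are not reproduced here.

```python
from datasets import load_dataset, concatenate_datasets

# Assumed configs for the Indonesian splits; the actual corpus also went
# through the Wikidepia cleanup script and the IDWiki link filter, which
# are not reproduced in this sketch.
oscar = load_dataset("oscar", "unshuffled_deduplicated_id", split="train")
mc4 = load_dataset("mc4", "id", split="train")

# Keep only the shared "text" column so the two schemas line up.
oscar = oscar.remove_columns([c for c in oscar.column_names if c != "text"])
mc4 = mc4.remove_columns([c for c in mc4.column_names if c != "text"])

# One combined text corpus (~29GB per the README above).
combined = concatenate_datasets([oscar, mc4]).shuffle(seed=42)
print(combined)
```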
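The eval perplexity column added to the table follows directly from the eval loss: GPT-2 is trained with a token-level cross-entropy objective, so perplexity is `exp(loss)`. A quick sanity check:

```python
import math

eval_loss = 2.696           # eval loss from the table above
print(math.exp(eval_loss))  # ≈ 14.82, matching the reported 14.826 up to rounding
```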
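To reproduce a zero-shot number like the one in the table, a minimal sketch using the Flax checkpoint (the model id is taken from the TensorBoard link in the diff; the input sentence is a placeholder, and a real evaluation would average over the held-out split):

```python
import jax
import jax.numpy as jnp
from transformers import AutoTokenizer, FlaxGPT2LMHeadModel

model_id = "flax-community/gpt2-medium-indonesian"  # from the TensorBoard link above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = FlaxGPT2LMHeadModel.from_pretrained(model_id)

# Placeholder input; a real eval would loop over the held-out set.
inputs = tokenizer("Sewindu sudah kita tak berjumpa", return_tensors="np")
logits = model(**inputs).logits[0]

# Next-token prediction: shift logits and labels by one position,
# then average the per-token negative log-likelihood.
log_probs = jax.nn.log_softmax(logits[:-1], axis=-1)
labels = inputs["input_ids"][0, 1:]
loss = -jnp.take_along_axis(log_probs, labels[:, None], axis=-1).mean()
print(f"loss={float(loss):.3f}  perplexity={float(jnp.exp(loss)):.3f}")
```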
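And a hypothetical sketch of how the Weights and Biases run linked above could be wired into a training loop; the project name is taken from the W&B URL, while the metric name and values are illustrative only:

```python
import wandb

# Project name from the W&B link above; everything else is illustrative.
run = wandb.init(project="hf-flax-gpt2-indonesian")
for step, loss in enumerate([3.2, 3.0, 2.9]):  # stand-in training losses
    run.log({"train/loss": loss}, step=step)
run.finish()
```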