Commit 1cac1d8 · 1 Parent(s): 34a1402
Author: Philip May

Update README.md

--- a/README.md
+++ b/README.md
@@ -26,12 +26,12 @@ The training was conducted with the following hyperparameters:
 
 - base model: [google/mt5-small](https://huggingface.co/google/mt5-small)
 - source_prefix: `"summarize: "`
-- batch size:
+- batch size: 3 (6)
 - max_source_length: 800
 - max_target_length: 96
 - warmup_ratio: 0.3
-- number of train epochs:
-- gradient accumulation steps:
+- number of train epochs: 10
+- gradient accumulation steps: 2
 
 ## Datasets and Preprocessing
 
@@ -49,7 +49,7 @@ This model is trained on the following dataset:
 
 | Model | rouge1 | rouge2 | rougeL | rougeLsum
 |-------|--------|--------|--------|----------
-| deutsche-telekom/mt5-small-sum-de-mit-v1 (this) |
+| deutsche-telekom/mt5-small-sum-de-mit-v1 (this) | 16.8023 | 3.5531 | 12.6884 | 14.7624
 | [ml6team/mt5-small-german-finetune-mlsum](https://huggingface.co/ml6team/mt5-small-german-finetune-mlsum) | 18.3607 | 5.3604 | 14.5456 | 16.1946
 | **[deutsche-telekom/mt5-small-sum-de-en-01](https://huggingface.co/deutsche-telekom/mt5-small-sum-de-en-v1)** | **21.7336** | **7.2614** | **17.1323** | **19.3977**
 
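The hyperparameters added in this commit can be checked with a little arithmetic. The sketch below assumes that `batch size: 3 (6)` means a per-device batch size of 3 and an effective batch size of 6 once the 2 gradient accumulation steps are applied; the README does not state this explicitly. The function names, the single-device default, and the dataset size of 6000 examples are hypothetical and used only for illustration. The step and warmup counts are a simplified approximation, not the exact schedule of any particular training library.

```python
import math


def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of examples that contribute to one optimizer update."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def training_steps(num_examples, per_device_batch_size, gradient_accumulation_steps,
                   num_epochs, warmup_ratio, num_devices=1):
    """Approximate total optimizer steps and warmup steps for a training run."""
    batch = effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices)
    steps_per_epoch = math.ceil(num_examples / batch)
    total_steps = steps_per_epoch * num_epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return total_steps, warmup_steps


# Values from the README diff: batch size 3, gradient accumulation steps 2.
print(effective_batch_size(3, 2))  # 3 * 2 = 6

# Hypothetical dataset size (6000); epochs and warmup_ratio are from the README diff.
total, warmup = training_steps(
    num_examples=6000,
    per_device_batch_size=3,
    gradient_accumulation_steps=2,
    num_epochs=10,
    warmup_ratio=0.3,
)
print(total, warmup)
```

Under this reading, the value in parentheses is the product of the other two settings, so the three lines changed in the first hunk are consistent with one another.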