Update README.md
README.md CHANGED

@@ -1,56 +1,43 @@
----
-tags:
-- generated_from_trainer
-model-index:
-- name: roberta_large_ukrainian
-  results: []
----
-
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-
-# roberta_large_ukrainian
-
-This model was trained from scratch on the None dataset.
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
-
-### Training hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 0.0001
-- train_batch_size: 4
-- eval_batch_size: 4
-- seed: 42
-- distributed_type: tpu
-- num_devices: 8
-- gradient_accumulation_steps: 16
-- total_train_batch_size: 512
-- total_eval_batch_size: 32
-- optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-06
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 25000
-- training_steps: 250000
-
-### Training results
-
-
-
-### Framework versions
-
-- Transformers 4.18.0.dev0
-- Pytorch 1.10.0+cu102
-- Datasets 1.18.4
-- Tokenizers 0.11.6
+---
+license: mit
+language: uk
+---
+
+# roberta-base-wechsel-ukrainian
+
+[`roberta-base`](https://huggingface.co/roberta-base) transferred to Ukrainian using the method from the NAACL 2022 paper [WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models](https://arxiv.org/abs/2112.06598).
+
+# Evaluation
+
+Evaluation was done on [lang-uk's ner-uk project](https://github.com/lang-uk/ner-uk), the Ukrainian portion of [WikiANN](https://huggingface.co/datasets/wikiann), and the [Ukrainian IU corpus from the Universal Dependencies project](https://github.com/UniversalDependencies/UD_Ukrainian-IU).
+
+__Validation Results__
+
+| Model                                            | lang-uk NER (Micro F1) | WikiANN (Micro F1) | UD Ukrainian IU POS (Accuracy) |
+|:-------------------------------------------------|:-----------------------|:-------------------|:-------------------------------|
+| roberta-base-wechsel-ukrainian                   | 88.06 (0.50)           | 92.96 (0.08)       | 98.70 (0.05)                   |
+| roberta-large-wechsel-ukrainian                  | __89.27 (0.53)__       | __93.22 (0.15)__   | __98.86 (0.03)__               |
+| roberta-base-scratch-ukrainian*                  | 85.49 (0.88)           | 91.91 (0.08)       | 98.49 (0.04)                   |
+| roberta-large-scratch-ukrainian*                 | 86.54 (0.70)           | 92.39 (0.16)       | 98.65 (0.09)                   |
+| dbmdz/electra-base-ukrainian-cased-discriminator | 87.49 (0.52)           | 93.20 (0.16)       | 98.60 (0.03)                   |
+| xlm-roberta-base                                 | 86.68 (0.44)           | 92.41 (0.13)       | 98.53 (0.02)                   |
+| xlm-roberta-large                                | 86.64 (1.61)           | 93.01 (0.13)       | 98.71 (0.04)                   |
+
+__Test Results__
+
+| Model                                            | lang-uk NER (Micro F1) | WikiANN (Micro F1) | UD Ukrainian IU POS (Accuracy) |
+|:-------------------------------------------------|:-----------------------|:-------------------|:-------------------------------|
+| roberta-base-wechsel-ukrainian                   | 90.81 (1.51)           | 92.98 (0.12)       | 98.57 (0.03)                   |
+| roberta-large-wechsel-ukrainian                  | __91.24 (1.16)__       | __93.22 (0.17)__   | __98.74 (0.06)__               |
+| roberta-base-scratch-ukrainian*                  | 89.57 (1.01)           | 92.05 (0.09)       | 98.31 (0.08)                   |
+| roberta-large-scratch-ukrainian*                 | 89.96 (0.89)           | 92.49 (0.15)       | 98.52 (0.04)                   |
+| dbmdz/electra-base-ukrainian-cased-discriminator | 90.43 (1.29)           | 92.99 (0.11)       | 98.59 (0.06)                   |
+| xlm-roberta-base                                 | 90.86 (0.81)           | 92.27 (0.09)       | 98.45 (0.07)                   |
+| xlm-roberta-large                                | 90.16 (2.98)           | 92.92 (0.19)       | 98.71 (0.04)                   |
+
+\*trained using the exact same training setup as the wechsel-\* models, but without parameter transfer.
+
+
+# License
+
+MIT
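The updated card describes a plain RoBERTa masked language model, so it can be used directly with the `transformers` fill-mask pipeline. A minimal sketch, assuming the model is published under the repo id `benjamin/roberta-base-wechsel-ukrainian` (an assumption; substitute the actual id):

```python
# Minimal sketch: masked-token prediction with the transferred model.
# The repo id below is an assumption; replace it with the actual model id.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="benjamin/roberta-base-wechsel-ukrainian")

# "Kyiv is the capital of <mask>." (RoBERTa models use the <mask> token.)
for prediction in fill_mask("Київ є столицею <mask>."):
    print(f"{prediction['token_str']:>12}  {prediction['score']:.3f}")
```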
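Of the three evaluation corpora, WikiANN is hosted on the Hugging Face Hub; a sketch of loading its Ukrainian split, assuming the `wikiann` dataset id with the `uk` config:

```python
# Sketch: load the Ukrainian split of WikiANN from the Hugging Face Hub.
# The lang-uk NER and UD Ukrainian-IU corpora are fetched from their
# GitHub repositories instead (see the links in the card).
from datasets import load_dataset

wikiann_uk = load_dataset("wikiann", "uk")

example = wikiann_uk["validation"][0]
print(example["tokens"])    # token list for one sentence
print(example["ner_tags"])  # integer NER labels aligned with the tokens
```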