---
language: ug
license: mit
---

# gpt2-wechsel-uyghur

Model trained with WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

See the code here: https://github.com/CPJKU/wechsel

And the paper here: https://arxiv.org/abs/2112.06598

## Performance

Perplexity (PPL; lower is better) of each WECHSEL model compared against `gpt2` retrained from scratch in the same language.

| Model | PPL |
|---|---|
| `gpt2-wechsel-sundanese` | **111.72** |
| `gpt2` (retrained from scratch) | 149.46 |

| Model | PPL |
|---|---|
| `gpt2-wechsel-scottish-gaelic` | **16.43** |
| `gpt2` (retrained from scratch) | 19.53 |

| Model | PPL |
|---|---|
| `gpt2-wechsel-uyghur` | **34.33** |
| `gpt2` (retrained from scratch) | 42.82 |

| Model | PPL |
|---|---|
| `gpt2-wechsel-malagasy` | **14.01** |
| `gpt2` (retrained from scratch) | 15.93 |

See our paper for details.

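To make the PPL columns easier to read, here is a minimal sketch that assumes PPL is the usual exponentiated mean negative log-likelihood per token (the paper has the exact evaluation setup). The helper names `perplexity` and `relative_reduction` are illustrative and are not part of the WECHSEL code base. The numbers in `results` are copied from the tables above.

```python
import math


def perplexity(token_log_probs):
    """Perplexity as exp of the mean negative log-likelihood.

    `token_log_probs` holds natural-log probabilities, one per token.
    Assumption: this standard definition matches the reported PPL.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_reduction(baseline_ppl, model_ppl):
    """Fraction by which `model_ppl` is lower than `baseline_ppl`."""
    return (baseline_ppl - model_ppl) / baseline_ppl


# (WECHSEL PPL, from-scratch PPL), copied from the tables above.
results = {
    "sundanese": (111.72, 149.46),
    "scottish-gaelic": (16.43, 19.53),
    "uyghur": (34.33, 42.82),
    "malagasy": (14.01, 15.93),
}

if __name__ == "__main__":
    for language, (wechsel_ppl, scratch_ppl) in results.items():
        gain = relative_reduction(scratch_ppl, wechsel_ppl)
        print(f"{language}: {gain:.1%} lower perplexity than training from scratch")
```

For this model, the drop from 42.82 to 34.33 is a reduction of roughly 20% relative to training from scratch.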

## Citation

Please cite WECHSEL as

```
@misc{minixhofer2021wechsel,
      title={WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models},
      author={Benjamin Minixhofer and Fabian Paischer and Navid Rekabsaz},
      year={2021},
      eprint={2112.06598},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```