---
language:
- et
- ar
- de
- en
- fi
- fr
- lt
- lv
- ru
- es
- sv
- uk
- zh
metrics:
- bleu
library_name: fairseq
pipeline_tag: translation
---

# Model Card for SynEst Translation Models

The SynEst models are machine translation models for translating from and into Estonian.

## Model Details

The models are based on the [NLLB-1.3B](https://huggingface.co/facebook/nllb-200-1.3B) multilingual model.
The NLLB encoder is frozen, and a new, smaller decoder is trained for each target language.
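
As a rough illustration of the frozen-encoder setup described above, the following PyTorch sketch freezes a toy encoder and optimizes only small per-language decoders. The module sizes, the `decoders` dictionary, the language keys, and the dummy loss are all placeholders chosen for illustration. This is not the training code used for the released models, which are listed here as fairseq models built on NLLB-1.3B.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-ins with hypothetical sizes; NOT the real NLLB-1.3B modules.
d_model = 32
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, dim_feedforward=64, batch_first=True),
    num_layers=2,
)

# One smaller decoder per target language (the keys are illustrative).
decoders = nn.ModuleDict({
    lang: nn.TransformerDecoder(
        nn.TransformerDecoderLayer(d_model=d_model, nhead=4, dim_feedforward=64, batch_first=True),
        num_layers=1,
    )
    for lang in ("en", "de")
})

# Freeze the encoder: its parameters receive no gradients and are never updated.
for param in encoder.parameters():
    param.requires_grad = False
encoder.eval()

# Only decoder parameters are handed to the optimizer.
optimizer = torch.optim.Adam(decoders.parameters(), lr=1e-3)

# Dummy batch: 2 "sentences", already embedded (source length 5, target length 4).
src = torch.randn(2, 5, d_model)
tgt = torch.randn(2, 4, d_model)

with torch.no_grad():
    memory = encoder(src)          # frozen encoder output
out = decoders["en"](tgt, memory)  # trainable decoder for one target language

loss = out.pow(2).mean()           # placeholder loss, for illustration only
loss.backward()
optimizer.step()
```

In this sketch only the decoder that was used in the forward pass receives gradients, and the encoder stays untouched, so the same encoder output could be shared across all target-language decoders.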

## Languages

The models were trained to translate from Estonian into German, English, Finnish, Russian, Ukrainian, and Chinese,
and into Estonian from Arabic, German, English, Finnish, French, Lithuanian, Latvian, Russian, Spanish,
Swedish, Ukrainian, and Chinese.

Because the parameters of the NLLB encoder are frozen, the models can also translate from any other
NLLB language, though likely with lower quality than for the languages on which they were
fine-tuned.
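
The language coverage above can be summarized in a small lookup. This is a minimal sketch: `direction_status` is a hypothetical helper, not part of any released code. It assumes that a target language is available only if it appears in the trained directions listed above, and that any other source language is an NLLB language the frozen encoder can read.

```python
# Language pairs listed in this model card (codes as in the card metadata).
FROM_ESTONIAN = {"de", "en", "fi", "ru", "uk", "zh"}
INTO_ESTONIAN = {"ar", "de", "en", "fi", "fr", "lt", "lv", "ru", "es", "sv", "uk", "zh"}

# Assumption: a decoder exists only for the target languages named in the card.
TARGETS = FROM_ESTONIAN | {"et"}


def direction_status(src: str, tgt: str) -> str:
    """Classify a translation direction (hypothetical helper for illustration)."""
    if tgt not in TARGETS or src == tgt:
        return "unsupported"
    if (src == "et" and tgt in FROM_ESTONIAN) or (tgt == "et" and src in INTO_ESTONIAN):
        return "fine-tuned"
    # Assumption: any other source is an NLLB language handled by the frozen
    # encoder, likely with lower quality than the fine-tuned directions.
    return "untuned-source"


print(direction_status("et", "en"))  # fine-tuned
print(direction_status("fr", "et"))  # fine-tuned
print(direction_status("fr", "en"))  # untuned-source
print(direction_status("et", "fr"))  # unsupported
```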

## How to Use the Model

The easiest way to run the models is with the [dedicated branch of the TartuNLP translation worker](https://github.com/TartuNLP/translation-worker/tree/nllb-based-est):
place `nllb-based`, with all its contents, inside the `models/` directory.

<!-- ## Evaluation

#### Testing Data

* [Flores](https://huggingface.co/datasets/facebook/flores) (devtest)
* [MTee](https://github.com/Project-MTee/MTee_translation_benchmarks/tree/main/benchmark_datasets)

#### Metrics

BLEU

### Results -->

## Model Card Authors

[@lisskor](https://huggingface.co/lisskor)