	---
	language:
	- et
	- ar
	- de
	- en
	- fi
	- fr
	- lt
	- lv
	- ru
	- es
	- sv
	- uk
	- zh
	metrics:
	- bleu
	library_name: fairseq
	pipeline_tag: translation
	---
	# Model Card for SynEst Translation Models

	The SynEst models are machine translation models focused on translating from and into the Estonian language.

	## Model Details

	The models are based on the [NLLB-1.3B](https://huggingface.co/facebook/nllb-200-1.3B) multilingual model.
	The NLLB encoder is frozen, and a new, smaller decoder is trained for each target language.

	## Languages

	The models were trained to translate from Estonian into German, English, Finnish, Russian, Ukrainian, and Chinese,
	and into Estonian from Arabic, German, English, Finnish, French, Lithuanian, Latvian, Russian, Spanish,
	Swedish, Ukrainian, and Chinese.

	However, because the parameters of the NLLB encoder are frozen, the models can also translate from any
	other NLLB source language, albeit likely with lower quality than for the languages on which they were
	fine-tuned.
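The language coverage described above can be summarised as a small lookup, which may be handy when routing requests. This is only a sketch: the helper names `is_fine_tuned_direction` and `has_decoder` are hypothetical and not part of the models or the translation worker. The language codes are taken from the metadata of this card, and the set of target languages with a decoder is inferred from the training directions listed above.

```python
# Language codes follow the `language:` list in the model card metadata.
# Targets the models were trained to translate into from Estonian ("et").
FROM_ESTONIAN = {"de", "en", "fi", "ru", "uk", "zh"}
# Sources the models were trained to translate from into Estonian.
INTO_ESTONIAN = {"ar", "de", "en", "fi", "fr", "lt", "lv", "ru", "es", "sv", "uk", "zh"}


def is_fine_tuned_direction(src: str, tgt: str) -> bool:
    """Return True if src -> tgt is one of the listed training directions."""
    if src == "et":
        return tgt in FROM_ESTONIAN
    if tgt == "et":
        return src in INTO_ESTONIAN
    return False


def has_decoder(tgt: str) -> bool:
    """Return True if a decoder exists for the target language.

    Assumption: one decoder per target language named in this card.
    """
    return tgt == "et" or tgt in FROM_ESTONIAN


print(is_fine_tuned_direction("et", "en"))  # listed direction
print(is_fine_tuned_direction("fr", "en"))  # not listed, but an English decoder exists
print(has_decoder("fr"))                    # no French decoder is listed
```

A pair such as `fr` to `en` is not a training direction, so `is_fine_tuned_direction` rejects it, yet `has_decoder("en")` holds. This is the case the paragraph above describes as possible but likely lower in quality. The sketch does not check whether the source language is actually covered by NLLB.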

	## How to Use the Model

	The easiest way to run the models is with the [dedicated branch of the TartuNLP translation worker](https://github.com/TartuNLP/translation-worker/tree/nllb-based-est):
	place the `nllb-based` directory, with all its contents, inside the worker's `models/` directory.
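The placement step can be scripted. The following is a minimal sketch, assuming the `nllb-based` directory has already been downloaded and the worker branch has been cloned locally. The function name `install_models` and both paths in the usage note are hypothetical, and the sketch does not start the worker itself.

```python
import shutil
from pathlib import Path


def install_models(source_dir: Path, worker_root: Path) -> Path:
    """Copy a downloaded `nllb-based` directory into `<worker_root>/models/`."""
    source_dir = Path(source_dir)
    if source_dir.name != "nllb-based" or not source_dir.is_dir():
        raise ValueError(f"expected a directory named 'nllb-based', got {source_dir}")
    target = Path(worker_root) / "models" / "nllb-based"
    # copytree creates missing parent directories (including models/);
    # dirs_exist_ok=True allows the copy to be re-run over an existing target.
    shutil.copytree(source_dir, target, dirs_exist_ok=True)
    return target
```

For example, `install_models(Path("downloads/nllb-based"), Path("translation-worker"))` would leave the models under `translation-worker/models/nllb-based`, with both paths adjusted to wherever the files actually live.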

	<!-- ## Evaluation

	#### Testing Data

	* [Flores](https://huggingface.co/datasets/facebook/flores) (devtest)
	* [MTee](https://github.com/Project-MTee/MTee_translation_benchmarks/tree/main/benchmark_datasets)

	#### Metrics

	BLEU

	### Results -->

	## Model Card Authors

	[@lisskor](https://huggingface.co/lisskor)