gsarti
/

opus-mt-tc-base-en-ja

Gabriele Sarti

Initial commit

7b254d5 over 2 years ago

	---
	language:
	- en
	- ja
	tags:
	- translation
	- opus-mt-tc
	license: cc-by-4.0
	model-index:
	- name: opus-mt-tc-base-en-ja
	  results:
	  - task:
	      name: Translation eng-jpn
	      type: translation
	      args: eng-jpn
	    dataset:
	      name: tatoeba-test-v2021-08-07
	      type: tatoeba_mt
	      args: eng-jpn
	    metrics:
	    - name: BLEU
	      type: bleu
	      value: 15.2
	---

	# Opus Tatoeba English-Japanese

	This model was obtained by running the script [convert_marian_to_pytorch.py](https://github.com/huggingface/transformers/blob/master/src/transformers/models/marian/convert_marian_to_pytorch.py) with the flag `-m eng-jpn`. The original model was trained by [Jörg Tiedemann](https://blogs.helsinki.fi/tiedeman/) using the [MarianNMT](https://marian-nmt.github.io/) library. See all available `MarianMTModel` models on the profile of the [Helsinki NLP](https://huggingface.co/Helsinki-NLP) group.

	* dataset: opus+bt
	* model: transformer-align
	* source language(s): eng
	* target language(s): jpn
	* pre-processing: normalization + SentencePiece (spm32k,spm32k)
	* download: [opus+bt-2021-04-10.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-jpn/opus+bt-2021-04-10.zip)
	* test set translations: [opus+bt-2021-04-10.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-jpn/opus+bt-2021-04-10.test.txt)
	* test set scores: [opus+bt-2021-04-10.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-jpn/opus+bt-2021-04-10.eval.txt)
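
Since the checkpoint was converted with the Marian conversion script, it should load through the `MarianMTModel` and `MarianTokenizer` classes of `transformers`. The following is a minimal sketch, not an official usage recipe: the repository ID is taken from this page, while the `translate` helper, its `max_new_tokens` default, and the example sentence are illustrative assumptions. It also assumes that `transformers`, PyTorch, and `sentencepiece` are installed.

```python
MODEL_ID = "gsarti/opus-mt-tc-base-en-ja"  # repository path as shown on this page


def translate(texts, model_id=MODEL_ID, max_new_tokens=128):
    """Translate a list of English sentences into Japanese (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require
    # transformers (and its PyTorch backend) to be installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    # Tokenize the whole batch at once, padding to the longest sentence.
    batch = tokenizer(texts, return_tensors="pt", padding=True)

    # max_new_tokens=128 is an assumed cap, not a value from the model card.
    generated = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["How are you today?"])` downloads the weights on first use and returns a list with one Japanese string per input sentence.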

	## Benchmarks

	| testset | BLEU | chr-F | #sent | #words | BP |
	|---------|------|-------|-------|--------|----|
	| Tatoeba-test.eng-jpn | 15.2 | 0.258 | 10000 | 99206 | 1.000 |
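
The BP column is BLEU's brevity penalty. A value of 1.000 means the system output was, in total, at least as long as the reference, so no length penalty was applied to the score. The sketch below shows the standard definition for illustration only. The function name and the example lengths are assumptions, and the evaluation tool that produced the table may count tokens differently.

```python
import math


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty for total hypothesis and reference lengths."""
    if hyp_len <= 0:
        # An empty hypothesis gets the maximum penalty.
        return 0.0
    if hyp_len >= ref_len:
        # Output at least as long as the reference: no penalty.
        return 1.0
    # Shorter output is penalized exponentially in the length ratio.
    return math.exp(1.0 - ref_len / hyp_len)
```

With made-up lengths, `brevity_penalty(120, 100)` gives `1.0`, while `brevity_penalty(50, 100)` gives `exp(-1)`, roughly 0.37.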