language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: tst-translation-output
    results: []

tst-translation-output

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt for Korean (ko) and English (en) translation. The training dataset is not recorded in this card. The model achieves the following results on the evaluation set:

Loss: 1.4429
Bleu: 21.4825
Gen Len: 18.792

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 16
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 40
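
The batch-size and scheduler entries above are related by simple arithmetic, which can be sketched in plain Python. This is an illustrative sketch rather than the training script: `linear_lr` and `total_steps` are hypothetical names, the planned number of optimizer steps is not stated in this card and must be supplied by the caller, and the code assumes no gradient accumulation and the usual "linear warmup, then linear decay to zero" shape for a `linear` scheduler.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4   # per device
NUM_DEVICES = 4
WARMUP_STEPS = 1000

# Effective batch size per optimizer step (assumes no gradient accumulation).
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES


def linear_lr(step: int, total_steps: int) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay.

    `total_steps` is an assumption supplied by the caller; the card does not
    report the planned number of optimizer steps.
    """
    if step < WARMUP_STEPS:
        # Ramp from 0 up to the peak rate over the warmup steps.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay linearly from the peak rate down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - WARMUP_STEPS)
```

With these values, `total_train_batch_size` evaluates to 16, matching the reported figure, and the learning rate reaches its 5e-05 peak at step 1000.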

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
1.3784	0.23	2000	1.5514	18.2602	19.2938
1.2953	0.46	4000	1.5006	19.6277	18.7905
1.2446	0.7	6000	1.4664	20.2667	19.2503
1.2095	0.93	8000	1.4482	20.8962	18.9352
0.9279	1.16	10000	1.4799	20.9876	19.093
0.9604	1.39	12000	1.4672	21.261	18.8735
0.9543	1.62	14000	1.4611	21.1987	18.8396
0.9532	1.86	16000	1.4429	21.4802	18.8239
0.6681	2.09	18000	1.5450	21.1981	18.6116
0.6971	2.32	20000	1.5516	21.3101	18.892
0.7283	2.55	22000	1.5405	20.902	18.6448
0.7308	2.78	24000	1.5363	21.3017	18.2578
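
The table above can be scanned programmatically to find the best evaluation step. The sketch below hard-codes the rows from the table; `ROWS`, `best_by_loss`, and `best_by_bleu` are hypothetical names introduced for illustration and are not part of the training code.

```python
# Rows copied from the training results table:
# (training_loss, epoch, step, validation_loss, bleu, gen_len)
ROWS = [
    (1.3784, 0.23, 2000, 1.5514, 18.2602, 19.2938),
    (1.2953, 0.46, 4000, 1.5006, 19.6277, 18.7905),
    (1.2446, 0.70, 6000, 1.4664, 20.2667, 19.2503),
    (1.2095, 0.93, 8000, 1.4482, 20.8962, 18.9352),
    (0.9279, 1.16, 10000, 1.4799, 20.9876, 19.0930),
    (0.9604, 1.39, 12000, 1.4672, 21.2610, 18.8735),
    (0.9543, 1.62, 14000, 1.4611, 21.1987, 18.8396),
    (0.9532, 1.86, 16000, 1.4429, 21.4802, 18.8239),
    (0.6681, 2.09, 18000, 1.5450, 21.1981, 18.6116),
    (0.6971, 2.32, 20000, 1.5516, 21.3101, 18.8920),
    (0.7283, 2.55, 22000, 1.5405, 20.9020, 18.6448),
    (0.7308, 2.78, 24000, 1.5363, 21.3017, 18.2578),
]

# Pick the row with the lowest validation loss and the row with the highest BLEU.
best_by_loss = min(ROWS, key=lambda row: row[3])
best_by_bleu = max(ROWS, key=lambda row: row[4])

print(f"lowest validation loss: {best_by_loss[3]} at step {best_by_loss[2]}")
print(f"highest BLEU: {best_by_bleu[4]} at step {best_by_bleu[2]}")
```

Both criteria select step 16000, whose validation loss matches the 1.4429 reported at the top of the card. After that step the training loss keeps falling while the validation loss rises, which may indicate overfitting.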

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.14.5
Tokenizers 0.14.1