metadata

language:
  - en
  - ko
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: mbartLarge_mid_en-ko1
    results: []

mbartLarge_mid_en-ko1

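This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set, which match the step-7500 checkpoint (epoch 5.61) in the training results, the point of lowest validation loss: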

Loss: 1.4106
Bleu: 13.2758
Gen Len: 16.235

Model description

More information needed

Intended uses & limitations

More information needed
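
Until this section is filled in, the following is a minimal usage sketch rather than documented behavior. It assumes the model translates English to Korean (inferred from the model name and the language tags), that the fine-tune kept the base model's vocabulary so the base tokenizer can be reused, and that mBART-50's standard language codes en_XX and ko_KR apply. The function name translate_en_to_ko and the model_path argument are hypothetical; the card does not state where the weights are published.

```python
# Sketch only: direction (en -> ko) and tokenizer reuse are assumptions.
BASE_MODEL = "facebook/mbart-large-50-many-to-many-mmt"  # base_model from this card
SRC_LANG = "en_XX"  # mBART-50 language code for English
TGT_LANG = "ko_KR"  # mBART-50 language code for Korean


def translate_en_to_ko(texts, model_path, max_new_tokens=64):
    """Translate a list of English strings with the fine-tuned checkpoint.

    model_path is a hypothetical local directory or hub id for the weights.
    """
    # Imported lazily so the constants above can be read without
    # transformers and torch installed.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(BASE_MODEL, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_path)
    model.eval()

    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **batch,
        # mBART-50 selects the output language through the first decoder token.
        forced_bos_token_id=tokenizer.lang_code_to_id[TGT_LANG],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

A call would look like translate_en_to_ko(["How are you today?"], "path/to/mbartLarge_mid_en-ko1"), where the path is a placeholder for wherever the checkpoint is stored.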

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 32
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 500
num_epochs: 40
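
The batch-size totals and the learning-rate schedule listed above can be checked with a short sketch. The total batch sizes follow from per-device size times devices times accumulation steps. The schedule function mirrors the usual linear-warmup-then-cosine-decay shape; total_steps is an assumption, since the card does not report it. The default of 53,500 is only a rough guess from the training log (about 1,500 steps per 1.12 epochs, scaled to 40 epochs).

```python
import math


def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Examples consumed per optimizer step (or per eval step if accum is 1)."""
    return per_device * num_devices * grad_accum_steps


def cosine_lr_with_warmup(step, base_lr=5e-5, warmup_steps=500, total_steps=53_500):
    """Learning rate at a given optimizer step.

    total_steps is a hypothetical value, not taken from the card.
    """
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(effective_batch_size(4, 4, 2))  # total_train_batch_size: 32
print(effective_batch_size(4, 4))     # total_eval_batch_size: 16
print(cosine_lr_with_warmup(500))     # peak rate at the end of warmup
```

Because the decay is stretched over the full planned run, a run that ends early finishes with a learning rate well above zero.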

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
1.5855	1.12	1500	1.5215	11.5186	16.204
1.4287	2.24	3000	1.4549	12.2855	16.1497
1.2937	3.37	4500	1.4250	12.6484	16.2152
1.2444	4.49	6000	1.4165	13.0063	16.0749
1.1335	5.61	7500	1.4106	13.2758	16.235
1.0508	6.73	9000	1.4243	13.0601	15.86
0.9462	7.86	10500	1.4497	13.0828	16.0475
0.8464	8.98	12000	1.4692	13.5878	15.9308
0.6995	10.1	13500	1.5572	13.1085	15.9906
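
The table can be read programmatically to see which checkpoint the headline metrics come from. The sketch below copies the rows above into a list, then picks the row with the lowest validation loss and the row with the highest BLEU. The Row type and variable names are illustrative, and the steps-per-epoch figure is only an estimate because the Epoch column is rounded.

```python
from typing import NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    bleu: float
    gen_len: float


# Values copied from the training results table in this card.
ROWS = [
    Row(1.5855, 1.12, 1500, 1.5215, 11.5186, 16.204),
    Row(1.4287, 2.24, 3000, 1.4549, 12.2855, 16.1497),
    Row(1.2937, 3.37, 4500, 1.4250, 12.6484, 16.2152),
    Row(1.2444, 4.49, 6000, 1.4165, 13.0063, 16.0749),
    Row(1.1335, 5.61, 7500, 1.4106, 13.2758, 16.235),
    Row(1.0508, 6.73, 9000, 1.4243, 13.0601, 15.86),
    Row(0.9462, 7.86, 10500, 1.4497, 13.0828, 16.0475),
    Row(0.8464, 8.98, 12000, 1.4692, 13.5878, 15.9308),
    Row(0.6995, 10.1, 13500, 1.5572, 13.1085, 15.9906),
]

best_loss = min(ROWS, key=lambda r: r.val_loss)
best_bleu = max(ROWS, key=lambda r: r.bleu)

# Rough estimate only: the Epoch column is rounded to two decimals.
steps_per_epoch = sum(r.step / r.epoch for r in ROWS) / len(ROWS)

print("lowest validation loss:", best_loss.step, best_loss.val_loss, best_loss.bleu)
print("highest BLEU:", best_bleu.step, best_bleu.val_loss, best_bleu.bleu)
print("approx. steps per epoch:", round(steps_per_epoch))
```

The lowest validation loss is at step 7500, while the highest BLEU is at step 12000. After step 7500 the training loss keeps falling while the validation loss rises, which is the usual sign of overfitting.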

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.14.5
Tokenizers 0.14.1