metadata

language:
  - ko
  - en
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: ko-en_mbartLarge_exp5p_linear
    results: []

ko-en_mbartLarge_exp5p_linear

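This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt. The training dataset is not documented; the language tags (ko, en) and the model name suggest Korean-to-English translation. It achieves the following results on the evaluation set, which match the step 15000 row (epoch 6.96) of the training results: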

Loss: 1.2308
Bleu: 26.2764
Gen Len: 18.3888
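Because the checkpoint is derived from mBART-50 many-to-many, it can presumably be loaded with the standard mBART-50 classes in Transformers. The sketch below is a minimal, unverified example under several assumptions: the Korean-to-English direction is inferred from the model name, `MODEL_ID` is a stand-in that uses the base model id from this card (the Hub id or local path of the fine-tuned checkpoint is not stated here and must be substituted), and `translate` and `max_new_tokens=64` are illustrative choices, not settings from the training run.

```python
# Assumption: replace MODEL_ID with the Hub id or local path of the fine-tuned
# checkpoint. The base model id from this card is used only as a stand-in.
MODEL_ID = "facebook/mbart-large-50-many-to-many-mmt"
SRC_LANG = "ko_KR"  # mBART-50 language code for Korean
TGT_LANG = "en_XX"  # mBART-50 language code for English


def translate(text, model_id=MODEL_ID, max_new_tokens=64):
    """Translate one Korean sentence to English (hypothetical helper)."""
    # Imported lazily so the heavy dependency is only loaded when translating.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id, src_lang=SRC_LANG)
    model = MBartForConditionalGeneration.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # Force the decoder to start with the English language token.
        forced_bos_token_id=tokenizer.lang_code_to_id[TGT_LANG],
        max_new_tokens=max_new_tokens,  # assumed cap, not from the card
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


# Example call (downloads the model weights on first use):
# print(translate("안녕하세요, 만나서 반갑습니다."))
```

The reported Gen Len of about 18 tokens suggests that evaluation outputs were short sentences, so a small generation cap like the one assumed here is likely sufficient for similar inputs.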

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 32
total_eval_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 40
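
The derived values in the list above follow from the per-device settings, and the linear scheduler can be sketched in a few lines. This is an illustrative reimplementation of the warmup-then-linear-decay shape, not the Trainer's own code. `linear_lr` is a hypothetical helper, and `ASSUMED_TOTAL_STEPS` is a rough guess (40 epochs at roughly 2,150 optimizer steps per epoch, estimated from the training log), since the card does not state the planned total number of steps.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 4
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 500

# Effective batch sizes: per-device size x devices (x accumulation for training).
total_train_batch = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS  # 32
total_eval_batch = PER_DEVICE_EVAL_BATCH * NUM_DEVICES  # 16

# Assumption: the total step count is not given in the card; this is an estimate.
ASSUMED_TOTAL_STEPS = 86_000


def linear_lr(step, total_steps=ASSUMED_TOTAL_STEPS,
              base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch)
    print("total_eval_batch_size:", total_eval_batch)
    for step in (0, 250, 500, 19_000):
        print(step, linear_lr(step))
```

Under this assumed total, the learning rate at the last logged step (19000) would still be most of the way to its peak, so the exact values printed depend entirely on the guessed step count.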

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
1.7665	0.46	1000	1.6564	17.6773	18.7196
1.5688	0.93	2000	1.4939	20.8837	18.3983
1.457	1.39	3000	1.4350	21.9168	18.458
1.4107	1.86	4000	1.3752	22.8881	18.4826
1.3039	2.32	5000	1.3327	23.8115	18.4348
1.282	2.78	6000	1.3079	24.235	18.3561
1.2133	3.25	7000	1.2820	24.8877	18.5204
1.1787	3.71	8000	1.2580	25.2719	18.415
1.1154	4.18	9000	1.2543	25.5507	18.3528
1.0956	4.64	10000	1.2415	25.7284	18.5348
1.023	5.1	11000	1.2410	25.7912	18.3347
0.95	5.57	12000	1.2327	25.9921	18.2593
0.9476	6.03	13000	1.2631	25.829	18.3686
0.9061	6.5	14000	1.2548	25.8316	18.7481
0.9037	6.96	15000	1.2308	26.2764	18.3888
0.7431	7.42	16000	1.2716	25.9268	18.256
0.7526	7.89	17000	1.2655	25.9883	18.2052
0.6654	8.35	18000	1.3118	25.6866	18.2217
0.6953	8.82	19000	1.3050	25.8958	18.3387
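
The log ends at epoch 8.82 although num_epochs is 40; the card does not say why. Validation loss stops improving after step 15000 while training loss keeps falling, which is consistent with overfitting in the later steps. The sketch below reads the table as data, picks the best row by BLEU and by validation loss, and derives a rough dataset size. The size is only an estimate: it assumes the Step column counts optimizer steps at the total train batch size of 32, and the Epoch column is rounded to two decimals.

```python
# (epoch, step, validation_loss, bleu) copied from the training results table.
ROWS = [
    (0.46, 1000, 1.6564, 17.6773),
    (0.93, 2000, 1.4939, 20.8837),
    (1.39, 3000, 1.4350, 21.9168),
    (1.86, 4000, 1.3752, 22.8881),
    (2.32, 5000, 1.3327, 23.8115),
    (2.78, 6000, 1.3079, 24.2350),
    (3.25, 7000, 1.2820, 24.8877),
    (3.71, 8000, 1.2580, 25.2719),
    (4.18, 9000, 1.2543, 25.5507),
    (4.64, 10000, 1.2415, 25.7284),
    (5.10, 11000, 1.2410, 25.7912),
    (5.57, 12000, 1.2327, 25.9921),
    (6.03, 13000, 1.2631, 25.8290),
    (6.50, 14000, 1.2548, 25.8316),
    (6.96, 15000, 1.2308, 26.2764),
    (7.42, 16000, 1.2716, 25.9268),
    (7.89, 17000, 1.2655, 25.9883),
    (8.35, 18000, 1.3118, 25.6866),
    (8.82, 19000, 1.3050, 25.8958),
]

TOTAL_TRAIN_BATCH = 32  # from the hyperparameter list

best_by_bleu = max(ROWS, key=lambda row: row[3])
best_by_loss = min(ROWS, key=lambda row: row[2])

# Use the last row for the ratio: the largest epoch value has the smallest
# relative rounding error.
last_epoch, last_step = ROWS[-1][0], ROWS[-1][1]
steps_per_epoch = last_step / last_epoch
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH  # rough estimate

if __name__ == "__main__":
    print("best BLEU at step", best_by_bleu[1])        # step 15000
    print("lowest val loss at step", best_by_loss[1])  # step 15000
    print("steps per epoch (approx):", round(steps_per_epoch))
    print("training examples (approx):", round(approx_train_examples))
```

Both criteria select the same row, which is also the row whose loss, BLEU, and Gen Len are reported as the evaluation results for this model.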

Framework versions

Transformers 4.34.0
PyTorch 2.1.0+cu121
Datasets 2.14.5
Tokenizers 0.14.1