metadata
language:
  - en
  - ko
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: enko_mbartLarge_36p_exp1
    results: []

enko_mbartLarge_36p_exp1

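This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt; the training dataset is not documented. It achieves the following results on the evaluation set, which correspond to the checkpoint at step 25000 (epoch 2.32) in the training results: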

  • Loss: 1.2181
  • Bleu: 15.4063
  • Gen Len: 14.7808
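
The card does not show how to run the model, so here is a minimal sketch of loading it for translation with Transformers. Two things are assumptions rather than facts from the card: the repo id `yesj1234/enko_mbartLarge_36p_exp1` (inferred from the uploader and model name) and the English-to-Korean direction (inferred from the "enko" prefix). `en_XX` and `ko_KR` are the mBART-50 language codes for English and Korean; the `translate` helper is a hypothetical name used only for illustration.

```python
# Assumed repo id (uploader name + model name); the card does not state it.
MODEL_ID = "yesj1234/enko_mbartLarge_36p_exp1"
SRC_LANG = "en_XX"  # mBART-50 code for English
TGT_LANG = "ko_KR"  # mBART-50 code for Korean (direction assumed from "enko")


def translate(texts, model_id=MODEL_ID, max_new_tokens=64):
    """Translate a list of source sentences; returns a list of strings."""
    # Imported lazily so this module loads even without Transformers installed.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
    model = MBartForConditionalGeneration.from_pretrained(model_id)
    tokenizer.src_lang = SRC_LANG

    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **batch,
        # mBART-50 picks the output language from the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# translate(["The weather is nice today."])
```

If the repository turns out not to ship tokenizer files, loading the tokenizer from the base model `facebook/mbart-large-50-many-to-many-mmt` should be an equivalent fallback, since fine-tuning with the Trainer does not change the vocabulary.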

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 15
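
The two "total" batch sizes follow from the per-device values, and the linear scheduler with warmup has a simple closed form. The sketch below reproduces both in plain Python. `linear_lr` is a hypothetical helper that mirrors the usual linear warmup and decay rule, and `EXAMPLE_TOTAL_STEPS` is an illustrative value only, since the card does not state the total number of optimizer steps.

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 4
per_device_eval_batch_size = 4
num_devices = 4
gradient_accumulation_steps = 2
learning_rate = 5e-05
warmup_steps = 500

# Examples per optimizer step: per-device batch x devices x accumulation steps.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
# Evaluation does not accumulate gradients, so only batch x devices applies.
total_eval_batch_size = per_device_eval_batch_size * num_devices


def linear_lr(step, total_steps, base_lr=learning_rate, warmup=warmup_steps):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup))


# Hypothetical horizon for illustration; not stated in the card.
EXAMPLE_TOTAL_STEPS = 160_000

print(total_train_batch_size, total_eval_batch_size)
for step in (0, 250, 500, 80_000, EXAMPLE_TOTAL_STEPS):
    print(step, linear_lr(step, EXAMPLE_TOTAL_STEPS))
```

The computed totals agree with the listed `total_train_batch_size` of 32 and `total_eval_batch_size` of 16.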

Training results

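| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|
| 1.4235        | 0.46  | 5000  | 1.3893          | 12.3168 | 14.6634 |
| 1.3281        | 0.93  | 10000 | 1.2917          | 14.3522 | 14.9186 |
| 1.2506        | 1.39  | 15000 | 1.2669          | 14.3525 | 14.9494 |
| 1.1603        | 1.86  | 20000 | 1.2283          | 15.248  | 15.0062 |
| 1.0765        | 2.32  | 25000 | 1.2181          | 15.4063 | 14.7808 |
| 1.1019        | 2.79  | 30000 | 1.2753          | 14.3608 | 14.9014 |
| 1.0504        | 3.25  | 35000 | 1.2334          | 15.3253 | 14.7948 |
| 0.9431        | 3.72  | 40000 | 1.2512          | 15.2534 | 14.7293 |
| 0.8394        | 4.18  | 45000 | 1.2971          | 14.9999 | 14.7793 |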
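
The reported evaluation numbers can be traced back to a single row of the training log. As a small sketch, the snippet below copies the logged rows into a list, picks the checkpoint with the lowest validation loss and the one with the highest BLEU, and estimates the number of optimizer steps per epoch. The estimate is rough because the logged epoch values are rounded to two decimals.

```python
# Rows copied from the training results:
# (training loss, epoch, step, validation loss, BLEU, generation length)
LOG = [
    (1.4235, 0.46, 5000, 1.3893, 12.3168, 14.6634),
    (1.3281, 0.93, 10000, 1.2917, 14.3522, 14.9186),
    (1.2506, 1.39, 15000, 1.2669, 14.3525, 14.9494),
    (1.1603, 1.86, 20000, 1.2283, 15.248, 15.0062),
    (1.0765, 2.32, 25000, 1.2181, 15.4063, 14.7808),
    (1.1019, 2.79, 30000, 1.2753, 14.3608, 14.9014),
    (1.0504, 3.25, 35000, 1.2334, 15.3253, 14.7948),
    (0.9431, 3.72, 40000, 1.2512, 15.2534, 14.7293),
    (0.8394, 4.18, 45000, 1.2971, 14.9999, 14.7993),
]

best_by_loss = min(LOG, key=lambda row: row[3])  # lowest validation loss
best_by_bleu = max(LOG, key=lambda row: row[4])  # highest BLEU

# Approximate steps per epoch from the last logged row (step / epoch).
last = LOG[-1]
steps_per_epoch = last[2] / last[1]

print("lowest validation loss at step", best_by_loss[2])
print("highest BLEU at step", best_by_bleu[2])
print("approx. steps per epoch:", round(steps_per_epoch))
```

Both criteria select step 25000, which matches the headline results. After that point the training loss keeps falling while the validation loss does not improve, which is consistent with the onset of overfitting. The log ends at epoch 4.18 although `num_epochs` is 15; the card does not say why.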

Framework versions

  • Transformers 4.34.1
  • PyTorch 2.1.0+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1
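
To check whether a local environment matches the versions listed above, a small standard-library sketch like the following may help. The PyPI distribution names are assumptions (in particular `torch` for PyTorch), and the `+cu121` build tag is dropped before comparing. `check_versions` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the card; distribution names are assumed PyPI names.
EXPECTED = {
    "transformers": "4.34.1",
    "torch": "2.1.0",  # card lists 2.1.0+cu121; local build tag ignored here
    "datasets": "2.14.6",
    "tokenizers": "0.14.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (expected, installed or None)} for mismatches only."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name).split("+")[0]  # strip local build tag
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in check_versions().items():
    print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one used for training.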