M2M101

This model is a fine-tuned version of facebook/m2m100_418M on a custom Maharashtri Prakrit–English dataset (described below). It achieves the following results on the evaluation set:

  • Loss: 0.6766
  • Bleu: 15.3416
  • Meteor: 0.4723
  • Gen Len: 28.0271

Model description

This model translates Maharashtri Prakrit (an ancient Indo-Aryan language) into English. It is fine-tuned from the M2M100 model.

Intended uses & limitations

This model is intended for educational use by students, linguists, and casual learners. It is currently limited and can produce incorrect translations, owing to the small training dataset and the low-resource nature of Maharashtri Prakrit.
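
A minimal inference sketch follows, assuming the model is loaded from the Hub under this repo's id (`sarch7040/praTranv2`). M2M100 selects the output language through a forced BOS token; Prakrit has no code in M2M100's language list, so the `src_lang` value below ("mr", Marathi) is an assumption rather than a documented choice.

```python
# Minimal inference sketch; the repo id and src_lang choice are assumptions.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "sarch7040/praTranv2"
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

# M2M100 has no Prakrit language code; "mr" (Marathi) is an assumed stand-in.
tokenizer.src_lang = "mr"

text = "..."  # a Maharashtri Prakrit sentence goes here
encoded = tokenizer(text, return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("en"),  # force English output
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```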

Training and evaluation data

This model was trained on a custom-made dataset of 1,474 Prakrit-to-English sentence pairs. It was evaluated using BLEU and METEOR scores.
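
The exact evaluation script is not published; the sketch below shows one plausible way to compute these metrics with the `evaluate` library (the reported BLEU figure's 0–100 scale suggests sacreBLEU). Predictions and references here are placeholders.

```python
# Hedged evaluation sketch; predictions/references are placeholders and the
# actual evaluation script is not published.
import evaluate

sacrebleu = evaluate.load("sacrebleu")  # BLEU on the 0-100 scale reported above
meteor = evaluate.load("meteor")

predictions = ["the king goes to the city"]   # model outputs (placeholder)
references = [["the king goes to the city"]]  # gold English translations

print(sacrebleu.compute(predictions=predictions, references=references)["score"])
print(meteor.compute(predictions=predictions,
                     references=[refs[0] for refs in references])["meteor"])
```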

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto training arguments follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
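
For readers reproducing the setup, here is one way these values could map onto Hugging Face `Seq2SeqTrainingArguments`. The actual training script is not published, so `output_dir` and the evaluation-related settings are assumptions.

```python
# Sketch mapping the listed hyperparameters onto Seq2SeqTrainingArguments.
# output_dir and the evaluation settings are assumptions; the rest mirrors
# the list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="praTranv2",       # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 matches the Trainer default
    lr_scheduler_type="linear",
    num_train_epochs=20,
    fp16=True,                    # Native AMP mixed precision
    eval_strategy="epoch",        # assumed; the results table below is per-epoch
    predict_with_generate=True,   # needed to compute BLEU/METEOR during evaluation
)
```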

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu    | Meteor | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|
| No log        | 1.0   | 74   | 5.8261          | 1.1786  | 0.1997 | 36.3831 |
| No log        | 2.0   | 148  | 4.6170          | 2.648   | 0.2657 | 37.6068 |
| No log        | 3.0   | 222  | 3.5128          | 5.7069  | 0.3217 | 32.2169 |
| No log        | 4.0   | 296  | 2.5281          | 6.3134  | 0.3547 | 31.8576 |
| No log        | 5.0   | 370  | 1.7177          | 8.5036  | 0.38   | 29.9729 |
| No log        | 6.0   | 444  | 1.1666          | 10.1169 | 0.3925 | 28.0678 |
| 3.5641        | 7.0   | 518  | 0.8702          | 10.4207 | 0.4246 | 31.1051 |
| 3.5641        | 8.0   | 592  | 0.7376          | 12.6153 | 0.431  | 28.6339 |
| 3.5641        | 9.0   | 666  | 0.6901          | 13.2966 | 0.4503 | 29.2373 |
| 3.5641        | 10.0  | 740  | 0.6713          | 11.9772 | 0.4396 | 30.5661 |
| 3.5641        | 11.0  | 814  | 0.6651          | 14.0436 | 0.4506 | 30.2    |
| 3.5641        | 12.0  | 888  | 0.6678          | 13.2632 | 0.4514 | 31.0542 |
| 3.5641        | 13.0  | 962  | 0.6677          | 14.0924 | 0.4563 | 29.278  |
| 0.5121        | 14.0  | 1036 | 0.6693          | 14.746  | 0.4651 | 28.4068 |
| 0.5121        | 15.0  | 1110 | 0.6698          | 14.9278 | 0.4677 | 28.5153 |
| 0.5121        | 16.0  | 1184 | 0.6700          | 14.7431 | 0.4674 | 28.9288 |
| 0.5121        | 17.0  | 1258 | 0.6744          | 15.2934 | 0.4701 | 28.8678 |
| 0.5121        | 18.0  | 1332 | 0.6741          | 15.6776 | 0.4712 | 28.3492 |
| 0.5121        | 19.0  | 1406 | 0.6772          | 14.942  | 0.4707 | 28.9695 |
| 0.5121        | 20.0  | 1480 | 0.6766          | 15.3416 | 0.4723 | 28.0271 |

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.4.0
  • Datasets 3.0.1
  • Tokenizers 0.20.0