mt5v2

This model is a fine-tuned version of thenameisdeba/results_mt5 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6690
  • Score: 0.1
  • Counts: [259, 47, 3, 1]
  • Totals: [14640, 14057, 13519, 13151]
  • Precisions: [1.7691256830601092, 0.33435299139218894, 0.022190990457874104, 0.007603984487871644]
  • Bp: 1.0
  • Sys Len: 14640
  • Ref Len: 1945
  • Gen Len: 117.3533
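
The Score, Counts, Totals, Precisions, Bp, Sys Len, and Ref Len fields follow sacreBLEU's output format, with precisions reported in percent. As a sanity check, the reported score can be recomputed from the precisions and brevity penalty; a minimal sketch using the standard BLEU formula:

```python
import math

# sacreBLEU-style fields reported above (precisions are in percent).
precisions = [1.7691256830601092, 0.33435299139218894,
              0.022190990457874104, 0.007603984487871644]
bp = 1.0  # brevity penalty; 1.0 because sys_len (14640) > ref_len (1945)

# BLEU = BP * geometric mean of the n-gram precisions.
score = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))
print(round(score, 4))  # ~0.1, matching the reported Score
```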

Model description

More information needed

Intended uses & limitations

More information needed
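
Pending details from the author, a minimal inference sketch follows; the repo id is an assumption based on the card title and may differ from the actual published path:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id is an assumption based on the card title; substitute the real path.
model_id = "thenameisdeba/mt5v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Your input text here", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```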

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
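
A minimal sketch of Seq2SeqTrainingArguments mirroring the hyperparameters above; the output directory, predict_with_generate flag, and any dataset or collator wiring are assumptions, not taken from this card:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above; output_dir is an assumption.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5v2",
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    predict_with_generate=True,  # assumption: needed for BLEU/gen_len metrics
)
```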

Training results

| Training Loss | Epoch | Step | Validation Loss | Score | Counts | Totals | Precisions | Bp | Sys Len | Ref Len | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-----:|:------:|:------:|:----------:|:--:|:-------:|:-------:|:-------:|
| 3.6412 | 0.7622 | 500 | 2.8887 | 0.1453 | [209, 42, 3, 1] | [9552, 8969, 8447, 8153] | [2.1880234505862646, 0.4682796298361021, 0.03551556765715639, 0.012265423770391267] | 1.0 | 9552 | 1945 | 89.7581 |
| 3.109 | 1.5244 | 1000 | 2.7566 | 0.0687 | [258, 47, 2, 0] | [16092, 15509, 14949, 14563] | [1.6032811334824757, 0.30304984202721, 0.013378821325841193, 0.0034333585112957493] | 1.0 | 16092 | 1945 | 122.3619 |
| 2.764 | 2.2866 | 1500 | 2.6690 | 0.1 | [259, 47, 3, 1] | [14640, 14057, 13519, 13151] | [1.7691256830601092, 0.33435299139218894, 0.022190990457874104, 0.007603984487871644] | 1.0 | 14640 | 1945 | 117.3533 |
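
The metric columns match the fields returned by the evaluate library's sacrebleu metric. A sketch of a compute_metrics callback that would produce them; the tokenizer in the surrounding scope and the gen_len computation are assumptions:

```python
import numpy as np
import evaluate

# Assumes `tokenizer` (the model's tokenizer) is defined in the surrounding scope.
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace label padding (-100) so the tokenizer can decode the references.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    # compute() returns score, counts, totals, precisions, bp, sys_len, ref_len.
    result = sacrebleu.compute(predictions=decoded_preds,
                               references=[[label] for label in decoded_labels])
    # Average generated length in tokens (assumed definition of gen_len).
    result["gen_len"] = float(np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]))
    return result
```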

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.7.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.21.1

Safetensors

  • Model size: 582M params
  • Tensor type: F32