---
license: cc-by-nc-4.0
base_model: Helsinki-NLP/opus-mt-tc-big-en-ar
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: Terjman-Large
    results: []
---

# Terjman-Large

This model is a fine-tuned version of [Helsinki-NLP/opus-mt-tc-big-en-ar](https://huggingface.co/Helsinki-NLP/opus-mt-tc-big-en-ar) on an unknown dataset.
It achieves the following results on the evaluation set (see the usage sketch below):

- Loss: 3.2078
- Bleu: 8.3292
- Gen Len: 34.4959
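
If you want to try the checkpoint directly, a minimal sketch using the standard `transformers` translation pipeline is shown below; the repo ID `BounharAbdelaziz/Terjman-Large` is an assumption inferred from this card and may need adjusting.

```python
from transformers import pipeline

# Hypothetical repo ID inferred from this card; adjust if the checkpoint
# is hosted elsewhere.
model_id = "BounharAbdelaziz/Terjman-Large"

# The base checkpoint is a Marian (opus-mt) seq2seq model, so the generic
# translation pipeline applies. The base opus-mt-tc-big models expect a
# target-language token (e.g. ">>ara<<"); check whether the fine-tuned
# tokenizer still requires it.
translator = pipeline("translation", model=model_id)

print(translator("Hello, how are you today?", max_length=128)[0]["translation_text"])
```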

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

- learning_rate: 3e-05
- train_batch_size: 22
- eval_batch_size: 22
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 88
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 40
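
As a rough illustration only, these settings could be expressed with `Seq2SeqTrainingArguments` roughly as follows; the output directory, evaluation strategy, and generation flag are assumed placeholders rather than values documented in this card.

```python
from transformers import Seq2SeqTrainingArguments

# Rough mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
# output_dir, evaluation_strategy, and predict_with_generate are illustrative
# placeholders, not values documented in this card.
training_args = Seq2SeqTrainingArguments(
    output_dir="terjman-large",           # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=22,
    per_device_eval_batch_size=22,
    gradient_accumulation_steps=4,        # effective train batch size: 22 * 4 = 88
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    num_train_epochs=40,
    evaluation_strategy="epoch",          # placeholder; actual schedule not stated
    predict_with_generate=True,           # needed to report BLEU / Gen Len during eval
)
```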

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:-------:|
| No log        | 0.9982  | 407   | 4.3938          | 4.6056 | 22.6033 |
| 5.1616        | 1.9988  | 815   | 3.7257          | 5.8319 | 30.9201 |
| 3.902         | 2.9994  | 1223  | 3.5214          | 6.7311 | 32.9091 |
| 3.5737        | 4.0     | 1631  | 3.4204          | 7.3684 | 32.1433 |
| 3.4576        | 4.9982  | 2038  | 3.3562          | 7.8632 | 34.5399 |
| 3.4576        | 5.9988  | 2446  | 3.3151          | 7.9739 | 35.3278 |
| 3.3833        | 6.9994  | 2854  | 3.2884          | 8.0825 | 35.8292 |
| 3.3358        | 8.0     | 3262  | 3.2681          | 8.2765 | 34.5427 |
| 3.3069        | 8.9982  | 3669  | 3.2517          | 8.1019 | 33.584  |
| 3.2769        | 9.9988  | 4077  | 3.2404          | 8.106  | 33.3802 |
| 3.2769        | 10.9994 | 4485  | 3.2342          | 8.3037 | 33.303  |
| 3.2777        | 12.0    | 4893  | 3.2284          | 8.0674 | 33.3967 |
| 3.2476        | 12.9982 | 5300  | 3.2226          | 8.2883 | 33.8154 |
| 3.2611        | 13.9988 | 5708  | 3.2189          | 8.3537 | 34.0413 |
| 3.2511        | 14.9994 | 6116  | 3.2159          | 8.1365 | 34.5014 |
| 3.2437        | 16.0    | 6524  | 3.2140          | 8.3549 | 34.0606 |
| 3.2437        | 16.9982 | 6931  | 3.2131          | 8.2507 | 34.303  |
| 3.2498        | 17.9988 | 7339  | 3.2116          | 8.2928 | 33.9945 |
| 3.2341        | 18.9994 | 7747  | 3.2105          | 8.337  | 33.7052 |
| 3.2403        | 20.0    | 8155  | 3.2098          | 8.3179 | 34.3526 |
| 3.2229        | 20.9982 | 8562  | 3.2094          | 8.3848 | 34.2039 |
| 3.2229        | 21.9988 | 8970  | 3.2090          | 8.2042 | 34.6529 |
| 3.2379        | 22.9994 | 9378  | 3.2086          | 8.4227 | 34.0275 |
| 3.2257        | 24.0    | 9786  | 3.2082          | 8.3515 | 34.3306 |
| 3.2526        | 24.9982 | 10193 | 3.2085          | 8.4089 | 34.4986 |
| 3.2206        | 25.9988 | 10601 | 3.2082          | 8.476  | 34.6226 |
| 3.2288        | 26.9994 | 11009 | 3.2083          | 8.4452 | 33.697  |
| 3.2288        | 28.0    | 11417 | 3.2080          | 8.29   | 34.0331 |
| 3.2251        | 28.9982 | 11824 | 3.2080          | 8.35   | 34.2948 |
| 3.2302        | 29.9988 | 12232 | 3.2078          | 8.4408 | 33.416  |
| 3.21          | 30.9994 | 12640 | 3.2079          | 8.2934 | 34.0854 |
| 3.2271        | 32.0    | 13048 | 3.2079          | 8.4573 | 33.3912 |
| 3.2271        | 32.9982 | 13455 | 3.2078          | 8.4055 | 34.2452 |
| 3.2428        | 33.9988 | 13863 | 3.2079          | 8.5107 | 34.5152 |
| 3.2303        | 34.9994 | 14271 | 3.2080          | 8.3734 | 34.2562 |
| 3.2129        | 36.0    | 14679 | 3.2079          | 8.3193 | 34.4628 |
| 3.2119        | 36.9982 | 15086 | 3.2082          | 8.4122 | 34.2121 |
| 3.2119        | 37.9988 | 15494 | 3.2078          | 8.3585 | 33.8843 |
| 3.2445        | 38.9994 | 15902 | 3.2079          | 8.3968 | 34.6722 |
| 3.2356        | 39.9264 | 16280 | 3.2078          | 8.3292 | 34.4959 |
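
For an independent sanity check of the BLEU numbers, a hedged evaluation sketch is shown below; the example sentences, reference placeholders, and the choice of the `bleu` metric from the `evaluate` library are assumptions, since the evaluation data is not documented here.

```python
import evaluate
from transformers import pipeline

# Hypothetical repo ID and example pairs; the real evaluation set and the exact
# BLEU implementation behind the reported scores are not documented in this card.
translator = pipeline("translation", model="BounharAbdelaziz/Terjman-Large")

sources = ["Good morning, everyone.", "Where is the nearest train station?"]
references = [["<reference translation 1>"], ["<reference translation 2>"]]  # placeholders

predictions = [out["translation_text"] for out in translator(sources, max_length=128)]

bleu = evaluate.load("bleu")  # matches the `bleu` metric tag in the metadata
print(bleu.compute(predictions=predictions, references=references))
```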

### Framework versions

- Transformers 4.40.2
- Pytorch 2.2.1+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1