Whisper Medium GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-medium on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop datasets. It achieves the following results on the evaluation set:

  • Loss: 1.1121
  • Bleu: 36.46
  • Chrf: 55.74
  • Wer: 58.2620
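
The card does not say how Wer is computed. Assuming it is the standard word error rate expressed as a percentage (word-level edit distance divided by the number of reference words), a minimal sketch looks like this. The function name and example sentences are illustrative only and are not taken from the evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Illustrative sentences (not from the evaluation set).
print(word_error_rate("the cat sat", "the cat sat"))
print(word_error_rate("the cat", "the cat the cat the cat"))
```

Because insertions count as errors, a hypothesis that is much longer than its reference can score above 100 under this definition.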

Model description

A Whisper model fine-tuned for Irish (GA) to English (EN) speech translation. More information needed.

Intended uses & limitations

More information needed
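
As a starting point, the model could probably be loaded through the Transformers `pipeline` API. This is an untested sketch, not usage documented by the author: it assumes the model ID `ymoslem/whisper-medium-ga2en-v6.2.2-r` shown on this page, that the `transformers` package and an audio decoding backend are installed, and that the helper name `translate_irish_audio` and the file name in the comment are hypothetical.

```python
# Model ID as it appears on this page.
MODEL_ID = "ymoslem/whisper-medium-ga2en-v6.2.2-r"


def translate_irish_audio(audio_path: str) -> str:
    """Translate Irish speech in an audio file into English text.

    Hypothetical helper: the card itself does not document usage.
    """
    # Imported lazily so the heavy dependency is only needed on call.
    from transformers import pipeline

    translator = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = translator(audio_path)
    return result["text"]


# Example call ("sample_ga.wav" is a hypothetical file name):
# print(translate_irish_audio("sample_ga.wav"))
```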

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.02
  • training_steps: 10000
  • mixed_precision_training: Native AMP
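
To make the scheduler settings concrete, the sketch below computes the learning rate implied by a linear schedule with warmup, using the values listed above (learning_rate 0.0001, warmup ratio 0.02, 10000 training steps). It assumes the usual behaviour of such a schedule: a linear ramp from 0 to the peak over the warmup steps, then a linear decay to 0. The function name is illustrative and the exact Trainer implementation may differ in rounding details.

```python
import math

LEARNING_RATE = 1e-4     # learning_rate
TRAINING_STEPS = 10_000  # training_steps
WARMUP_RATIO = 0.02      # lr_scheduler_warmup_ratio


def linear_schedule_lr(
    step: int,
    base_lr: float = LEARNING_RATE,
    total_steps: int = TRAINING_STEPS,
    warmup_ratio: float = WARMUP_RATIO,
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for s in (0, 100, 200, 5_100, 10_000):
    print(s, linear_schedule_lr(s))
```

Under these assumptions the warmup lasts 200 steps, the peak rate of 0.0001 is reached at step 200, and the rate returns to 0 at step 10000.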

Training results

Training Loss Epoch Step Bleu Chrf Validation Loss Wer
2.6534 0.0138 100 1.43 15.99 2.2446 269.1130
2.4519 0.0276 200 2.13 18.36 2.1941 250.5178
2.2928 0.0414 300 7.14 25.95 2.0086 128.3656
2.233 0.0552 400 5.61 24.25 2.0239 134.0837
2.0406 0.0690 500 5.64 25.65 1.9215 183.8361
2.0273 0.0828 600 13.41 30.96 1.8556 83.7010
1.895 0.0966 700 7.02 26.82 1.8278 158.2170
1.9889 0.1103 800 12.22 31.62 1.7842 99.6398
1.8484 0.1241 900 10.97 30.45 1.7648 91.1751
1.7491 0.1379 1000 10.0 29.42 1.7498 109.0050
1.699 0.1517 1100 12.53 34.87 1.6662 109.9054
1.6959 0.1655 1200 14.54 34.8 1.6287 92.3008
1.6682 0.1793 1300 13.26 33.5 1.5800 103.0617
1.6625 0.1931 1400 19.71 37.33 1.6115 75.9118
1.5462 0.2069 1500 18.3 39.49 1.4993 93.7866
1.3834 0.2207 1600 20.32 40.87 1.4906 79.2436
1.39 0.2345 1700 17.3 38.16 1.4752 93.1562
1.5061 0.2483 1800 20.11 39.69 1.4004 81.0446
1.4125 0.2621 1900 23.82 42.67 1.3854 73.3904
1.3181 0.2759 2000 20.57 40.87 1.3979 78.8384
1.283 0.2897 2100 17.97 40.47 1.3446 88.8789
1.2061 0.3034 2200 25.12 45.42 1.3130 73.5254
1.2091 0.3172 2300 22.12 43.56 1.3274 79.8739
1.1264 0.3310 2400 22.94 45.96 1.2771 78.2080
1.0972 0.3448 2500 24.38 46.04 1.2858 75.4615
1.0822 0.3586 2600 27.39 48.34 1.2376 67.6722
1.0316 0.3724 2700 28.0 47.61 1.2461 68.5277
1.165 0.3862 2800 26.05 48.13 1.1869 71.6794
1.025 0.4 2900 27.14 47.91 1.1716 68.7528
0.8978 0.4138 3000 28.34 49.15 1.1628 65.6461
0.9146 0.4276 3100 25.81 48.42 1.1703 71.7244
0.9764 0.4414 3200 29.63 51.22 1.1526 67.3570
0.9455 0.4552 3300 25.31 49.73 1.1108 72.6249
0.9073 0.4690 3400 27.7 50.85 1.1085 72.7150
0.8596 0.4828 3500 28.34 52.39 1.0927 67.9424
0.8241 0.4966 3600 29.95 51.37 1.1026 65.2859
0.8436 0.5103 3700 27.18 51.45 1.0718 71.2292
0.8318 0.5241 3800 30.71 53.35 1.0678 64.3404
0.8262 0.5379 3900 27.05 51.94 1.0534 71.5894
0.8129 0.5517 4000 27.38 51.97 1.0491 72.1747
0.9036 0.5655 4100 14.43 40.57 1.2250 139.3066
1.0314 0.5793 4200 24.27 46.97 1.2310 75.5966
0.9209 0.5931 4300 23.55 46.04 1.2447 76.4070
0.9204 0.6069 4400 25.87 45.32 1.2891 73.0302
0.9843 0.6207 4500 27.2 46.36 1.2269 71.8145
1.0225 0.6345 4600 26.16 45.72 1.2403 69.6983
0.9773 0.6483 4700 26.37 45.62 1.2464 68.4376
0.9794 0.6621 4800 24.77 47.11 1.2461 72.0846
0.8905 0.6759 4900 24.58 46.35 1.2345 71.2742
0.8305 0.6897 5000 27.28 48.37 1.2239 68.1675
0.9019 0.7034 5100 27.04 50.28 1.1730 70.1486
0.7969 0.7172 5200 26.27 48.07 1.1807 69.0230
0.8036 0.7310 5300 23.04 48.3 1.1632 77.5326
0.8195 0.7448 5400 25.58 50.29 1.1811 76.2269
0.7697 0.7586 5500 23.99 48.91 1.1825 81.4948
0.727 0.7724 5600 23.93 49.23 1.1623 79.5137
0.8002 0.7862 5700 26.29 50.44 1.1503 75.6866
0.6909 0.8 5800 29.27 50.85 1.1338 64.0252
0.7146 0.8138 5900 28.24 50.82 1.1420 66.6367
0.7452 0.8276 6000 31.33 51.92 1.1328 62.4944
0.5989 0.8414 6100 31.1 52.15 1.1455 65.1959
0.6818 0.8552 6200 32.56 52.46 1.1112 62.1342
0.6074 0.8690 6300 33.48 53.32 1.1072 60.6033
0.5942 0.8828 6400 31.39 51.03 1.1462 62.8546
0.6341 0.8966 6500 31.55 52.15 1.1093 62.4043
0.5992 0.9103 6600 33.06 52.52 1.1215 61.4588
0.6156 0.9241 6700 32.38 52.76 1.1031 62.9446
0.6169 0.9379 6800 31.46 52.96 1.1082 64.3404
0.6543 0.9517 6900 33.49 54.02 1.0943 63.1247
0.5017 0.9655 7000 30.95 52.64 1.1141 68.6177
0.5583 0.9793 7100 34.39 54.03 1.1004 61.6839
0.5986 0.9931 7200 33.92 52.85 1.1055 62.4944
0.2443 1.0069 7300 34.86 53.01 1.1442 60.1981
0.254 1.0207 7400 33.92 53.25 1.1458 62.1792
0.2827 1.0345 7500 34.49 53.43 1.1190 60.6484
0.2326 1.0483 7600 35.47 53.53 1.1237 59.2076
0.2017 1.0621 7700 34.65 53.87 1.1179 60.0180
0.2367 1.0759 7800 34.23 53.67 1.1075 60.6484
0.2276 1.0897 7900 34.67 54.51 1.1063 60.3332
0.2087 1.1034 8000 34.44 54.07 1.1090 60.6484
0.2514 1.1172 8100 29.85 51.91 1.1199 69.6083
0.2692 1.1310 8200 28.05 51.94 1.1642 72.1747
0.2784 1.1448 8300 27.26 50.77 1.1262 74.8312
0.2539 1.1586 8400 30.7 53.1 1.1463 65.0158
0.2599 1.1724 8500 31.64 53.71 1.1255 63.2148
0.2419 1.1862 8600 33.2 54.15 1.1223 62.4043
0.2583 1.2 8700 33.98 53.65 1.1304 61.2787
0.239 1.2138 8800 34.68 54.35 1.1371 61.7740
0.2198 1.2276 8900 30.65 52.15 1.1533 72.2647
0.248 1.2414 9000 31.98 53.68 1.1266 65.4210
0.2377 1.2552 9100 30.9 53.6 1.1510 67.9424
0.2183 1.2690 9200 30.35 53.04 1.1565 73.1202
0.1999 1.2828 9300 29.48 53.0 1.1426 74.2909
0.22 1.2966 9400 31.93 53.16 1.1332 66.1414
0.2063 1.3103 9500 32.42 53.79 1.1144 63.3949
0.2054 1.3241 9600 33.64 54.69 1.1146 61.5038
0.2145 1.3379 9700 36.68 55.64 1.1123 57.5867
0.2059 1.3517 9800 36.93 56.15 1.1102 57.5416
0.2001 1.3655 9900 36.4 56.09 1.1143 57.9469
0.1973 1.3793 10000 36.46 55.74 1.1121 58.2620
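
The log can also be used to compare checkpoints. The sketch below copies a handful of rows from the table, stored as (step, Bleu, Chrf, validation loss, Wer), and picks the best step under each metric. The values for step 10000 are the ones reported as the final evaluation results. The `Checkpoint` type and variable names are illustrative and not part of the training code.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    step: int
    bleu: float
    chrf: float
    val_loss: float
    wer: float


# A subset of rows from the training log above.
checkpoints = [
    Checkpoint(4000, 27.38, 51.97, 1.0491, 72.1747),
    Checkpoint(8000, 34.44, 54.07, 1.1090, 60.6484),
    Checkpoint(9700, 36.68, 55.64, 1.1123, 57.5867),
    Checkpoint(9800, 36.93, 56.15, 1.1102, 57.5416),
    Checkpoint(9900, 36.40, 56.09, 1.1143, 57.9469),
    Checkpoint(10000, 36.46, 55.74, 1.1121, 58.2620),
]

# Higher is better for Bleu; lower is better for loss and Wer.
best_by_bleu = max(checkpoints, key=lambda c: c.bleu)
best_by_loss = min(checkpoints, key=lambda c: c.val_loss)
best_by_wer = min(checkpoints, key=lambda c: c.wer)

print("best Bleu:", best_by_bleu.step)
print("lowest validation loss:", best_by_loss.step)
print("lowest Wer:", best_by_wer.step)
```

In this subset the lowest validation loss occurs at step 4000, while Bleu and Wer are best at step 9800, so the choice of checkpoint depends on which metric is used for selection.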

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 764M params (Safetensors, tensor type F32)

Model ID: ymoslem/whisper-medium-ga2en-v6.2.2-r

Evaluation results

  • Bleu on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop: 36.460 (self-reported)
  • Wer on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop: 58.262 (self-reported)