Whisper Small GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-small on the IWSLT-2023, FLEURS, BiteSize, and SpokenWords datasets. The best checkpoint by ChrF (the one uploaded here) is at step 3300, epoch 3.56, and it achieves the following results on the evaluation set:

  • Loss: 1.5823
  • BLEU: 29.81
  • ChrF: 46.50
  • WER: 66.7267

The best checkpoint by BLEU, at step 3400 (epoch 3.67), achieves the following results:

  • Loss: 1.5752
  • BLEU: 30.77
  • ChrF: 46.43
  • WER: 64.6556
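
As a quick usage sketch (not an official example from this repository), the checkpoint can be loaded with the Hugging Face transformers pipeline. The repository ID below is this model's; the audio path and the explicit generation settings are illustrative assumptions.

```python
# Minimal sketch: translate Irish (GA) speech into English text with this checkpoint.
from transformers import pipeline

translator = pipeline(
    "automatic-speech-recognition",
    model="ymoslem/whisper-small-ga2en-v1.2",
)

result = translator(
    "audio_ga.wav",  # placeholder path to a 16 kHz mono Irish-language recording
    # Mirrors the language=English setting noted below; the checkpoint's
    # generation config may already force these values.
    generate_kwargs={"task": "translate", "language": "english"},
)
print(result["text"])  # English translation of the utterance
```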

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Experiment

  • language: English
  • additional training steps

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.03 (likely a warmup ratio, i.e. about 120 of the 4,000 training steps)
  • training_steps: 4000
  • mixed_precision_training: Native AMP
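
For orientation, the list above maps roughly onto transformers' Seq2SeqTrainingArguments as sketched below. The output directory, evaluation cadence, predict_with_generate, and the reading of the 0.03 warmup value as a ratio are assumptions rather than the authors' exact script.

```python
# Rough, illustrative mapping of the listed hyperparameters onto
# Seq2SeqTrainingArguments; not the authors' training code.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-ga2en-v1.2",  # assumed output directory
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",                    # Adam with betas=(0.9, 0.999), epsilon=1e-8
    lr_scheduler_type="linear",
    warmup_ratio=0.03,                      # listed as "warmup_steps: 0.03"; treated as a ratio here (assumption)
    max_steps=4000,
    fp16=True,                              # mixed precision (native AMP)
    evaluation_strategy="steps",
    eval_steps=100,                         # evaluation every 100 steps, per the results table below
    predict_with_generate=True,             # needed to compute BLEU/ChrF/WER at evaluation time (assumption)
)
```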

Training results

Training Loss  Epoch  Step  BLEU  ChrF  Validation Loss  WER
2.4954 0.11 100 3.7 18.03 2.1286 179.7839
2.045 0.22 200 12.65 25.53 1.8146 100.9005
1.7928 0.32 300 13.78 30.2 1.7253 101.9811
1.6615 0.43 400 15.8 31.88 1.6834 92.5259
1.4491 0.54 500 15.61 36.27 1.5971 107.3841
1.2074 0.65 600 19.92 36.31 1.5939 84.3314
1.2308 0.76 700 20.37 38.72 1.5234 84.8267
1.107 0.86 800 21.35 37.87 1.5460 82.8906
0.9491 0.97 900 21.06 40.74 1.5161 82.5754
0.384 1.08 1000 23.24 41.98 1.4927 82.2152
0.362 1.19 1100 23.19 42.24 1.5567 80.2792
0.3756 1.29 1200 27.83 43.8 1.5265 69.2481
0.3401 1.4 1300 21.79 41.66 1.5522 92.3908
0.3346 1.51 1400 24.61 42.15 1.5085 75.4615
0.3101 1.62 1500 26.67 43.41 1.4933 70.7789
0.3231 1.73 1600 27.95 42.82 1.4979 68.3026
0.2665 1.83 1700 28.5 43.76 1.4977 68.1225
0.2704 1.94 1800 28.15 43.87 1.5063 68.8429
0.0769 2.05 1900 25.76 43.22 1.5162 77.6227
0.0597 2.16 2000 25.04 43.15 1.5216 79.0635
0.0743 2.27 2100 27.85 44.43 1.5313 68.3926
0.0878 2.37 2200 27.54 43.96 1.5495 68.3476
0.0712 2.48 2300 28.28 44.39 1.5355 65.8712
0.0789 2.59 2400 28.64 44.75 1.5277 65.7812
0.073 2.7 2500 29.09 44.65 1.5327 65.7812
0.073 2.8 2600 25.26 43.44 1.5304 78.2981
0.0697 2.91 2700 25.71 43.02 1.5460 78.4782
0.0398 3.02 2800 28.26 44.71 1.5580 72.8501
0.0302 3.13 2900 30.25 45.46 1.5688 66.1414
0.0424 3.24 3000 29.88 45.21 1.5693 66.0964
0.0397 3.34 3100 30.01 45.85 1.5934 65.6911
0.0346 3.45 3200 30.2 45.8 1.5818 65.8262
0.032 3.56 3300 29.81 46.5 1.5823 66.7267
0.0348 3.67 3400 30.77 46.43 1.5752 64.6556
0.0277 3.78 3500 30.3 46.02 1.5791 64.6105
0.0364 3.88 3600 29.92 45.38 1.5895 65.0608
0.0398 3.99 3700 27.79 44.59 1.6167 68.2575
0.0152 4.1 3800 28.42 44.83 1.6241 67.5822
0.0201 4.21 3900 29.02 45.11 1.6243 67.4921
0.0168 4.31 4000 26.85 44.41 1.6195 73.5254
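
The BLEU, ChrF, and WER columns above (WER is a percentage) can be reproduced with the Hugging Face evaluate library; the snippet below is a minimal illustration with placeholder predictions and references, not the authors' evaluation code.

```python
# Minimal sketch of computing the three reported metrics with the evaluate library.
import evaluate

bleu_metric = evaluate.load("sacrebleu")
chrf_metric = evaluate.load("chrf")
wer_metric = evaluate.load("wer")

predictions = ["the weather is fine today"]   # hypothetical model outputs (English)
references = ["the weather is nice today"]    # hypothetical reference translations

bleu = bleu_metric.compute(predictions=predictions, references=[[r] for r in references])["score"]
chrf = chrf_metric.compute(predictions=predictions, references=[[r] for r in references])["score"]
wer = 100 * wer_metric.compute(predictions=predictions, references=references)  # as a percentage

print(f"BLEU: {bleu:.2f}  ChrF: {chrf:.2f}  WER: {wer:.2f}")
```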

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Evaluation results

  • BLEU on IWSLT-2023, FLEURS, BiteSize, SpokenWords (self-reported): 26.850
  • WER on IWSLT-2023, FLEURS, BiteSize, SpokenWords (self-reported): 73.525