Whisper Large GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-large for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets. It achieves the following results on the evaluation set:

  • Loss: 1.1318
  • Bleu: 31.26
  • Chrf: 50.41
  • Wer: 62.3143
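
The card does not include an inference example, so the following is a minimal sketch of how the checkpoint (its Hub ID, ymoslem/whisper-large-ga2en-v2.1, is taken from the model page) could be used for Irish-to-English speech translation with the transformers pipeline API. The audio path is a placeholder, and depending on how the generation config was saved, the explicit task="translate" setting may be redundant.

```python
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned checkpoint from the Hugging Face Hub.
translator = pipeline(
    "automatic-speech-recognition",
    model="ymoslem/whisper-large-ga2en-v2.1",
    device=device,
)

# "sample_ga.wav" is a placeholder path to a (preferably 16 kHz) Irish speech recording.
result = translator("sample_ga.wav", generate_kwargs={"task": "translate"})
print(result["text"])  # English translation of the spoken Irish input
```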

Model description

The model translates Irish (GA) speech into English (EN) text. It is based on openai/whisper-large and was fine-tuned on the Irish-English speech translation data listed above.

Intended uses & limitations

The model is intended for Irish-to-English speech translation. Given the evaluation scores above (BLEU 31.26, WER 62.31), its output can still contain substantial errors and should be reviewed before being relied upon.

Training and evaluation data

The model was trained and evaluated on Irish-to-English speech translation data from the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 3000
  • mixed_precision_training: Native AMP
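
The training script itself is not part of the card. As a rough sketch only, the hyperparameters above could map onto a transformers Seq2SeqTrainingArguments configuration along the following lines; the output directory, evaluation cadence, and the reading of the 0.03 warmup value as a ratio are assumptions, not taken from the original setup.

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative mapping of the listed hyperparameters; values marked as
# hypothetical are not documented in the model card.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-ga2en",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,       # 4 x 4 = effective batch size 16
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,                   # assumes the 0.03 value is a warmup ratio
    max_steps=3000,
    fp16=True,                           # native AMP mixed precision
    evaluation_strategy="steps",         # hypothetical: matches the 100-step eval logs
    eval_steps=100,
    predict_with_generate=True,          # Adam betas/epsilon above are the defaults
)
```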

Training results

Training Loss  Epoch   Step  Bleu   Chrf   Validation Loss  Wer
3.1547         0.03    100   3.75   18.71  2.4173           124.0882
2.6996         0.07    200   8.16   25.45  2.1329           114.1378
2.4841         0.1     300   6.4    23.6   2.0262           158.1720
2.4706         0.13    400   9.16   27.67  1.9688           120.0810
2.3575         0.16    500   13.66  31.5   1.8284           100.8555
2.1916         0.2     600   12.97  31.8   1.7486           110.1756
2.1353         0.23    700   16.7   33.52  1.7568           86.8528
1.9885         0.26    800   19.34  35.35  1.6395           78.7033
1.9126         0.3     900   20.21  36.28  1.5658           78.2080
1.6418         0.33    1000  18.61  38.49  1.4998           86.8528
1.5782         0.36    1100  22.91  40.04  1.4716           71.0941
1.4899         0.39    1200  21.55  40.92  1.4444           78.7933
1.3155         0.43    1300  24.95  42.05  1.3934           70.9140
1.4144         0.46    1400  28.38  46.18  1.2791           65.8262
1.1949         0.49    1500  26.95  45.84  1.2879           70.6889
1.0179         0.53    1600  26.12  46.4   1.2624           69.6983
1.0935         0.56    1700  28.51  48.24  1.2076           67.4021
1.061          0.59    1800  27.42  48.83  1.1812           71.4543
1.0955         0.62    1900  31.32  49.91  1.1503           62.9896
1.0607         0.66    2000  31.26  50.41  1.1318           62.3143
1.1135         0.6897  2100  26.57  46.18  1.2135           69.7884
0.9819         0.7225  2200  26.95  49.47  1.2252           65.0158
0.9909         0.7553  2300  30.35  46.49  1.2072           63.3048
0.9521         0.7882  2400  24.76  46.44  1.2130           70.6889
0.8245         0.8210  2500  24.84  47.05  1.1724           78.1630
0.8303         0.8539  2600  27.56  47.48  1.1812           70.1036
0.6934         0.8867  2700  31.61  50.4   1.1716           63.8001
0.7117         0.9195  2800  30.82  49.95  1.1650           65.0158
0.6944         0.9524  2900  31.21  49.8   1.1516           63.5750
0.7132         0.9852  3000  30.16  49.77  1.1390           65.6011

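The evaluation script is likewise not included in the card. The sketch below shows one way the reported metric types (BLEU via sacrebleu, ChrF, and WER) can be computed with the evaluate library; the predictions and references are toy examples, not data from this model.

```python
import evaluate

# Toy hypotheses and references, purely for illustration.
predictions = ["the weather is fine today", "she bought a new book"]
references = [["the weather is nice today"], ["she bought a new book"]]

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
wer = evaluate.load("wer")

bleu_score = bleu.compute(predictions=predictions, references=references)["score"]
chrf_score = chrf.compute(predictions=predictions, references=references)["score"]
# WER expects one reference string per prediction and returns a fraction.
wer_score = wer.compute(predictions=predictions,
                        references=[r[0] for r in references])

print(f"BLEU: {bleu_score:.2f}  ChrF: {chrf_score:.2f}  WER: {100 * wer_score:.2f}")
```
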
Framework versions

  • Transformers 4.40.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.18.0
  • Tokenizers 0.19.1

Evaluation results

  • Bleu on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia: 30.160 (self-reported)
  • Wer on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia: 65.601 (self-reported)