whisper-large-v3-turbo-darija-st

This model is a fine-tuned version of openai/whisper-large-v3-turbo on the Darija-C dataset. It achieves the following results on the evaluation set at the final training step (step 1000):

  • Loss: 0.4467
  • Bleu: 0.1543

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 1000
  • mixed_precision_training: Native AMP
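
To make the scheduler settings above concrete, the following is a minimal sketch of a linear schedule with warmup, written in plain Python and not taken from the training script. The function name `linear_schedule_lr` is hypothetical, and the formula is assumed to match the usual linear warmup followed by linear decay to zero; the defaults are the values listed above (learning_rate 1e-05, 500 warmup steps, 1000 training steps).

```python
def linear_schedule_lr(step, base_lr=1e-5, warmup_steps=500, total_steps=1000):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    Hypothetical helper for illustration; defaults mirror the hyperparameters
    listed in this card.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at a few points of the 1000-step run.
    for step in (0, 250, 500, 750, 1000):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at step 500, halfway through the run, so the first half of training is spent entirely in warmup.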

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   |
|---------------|-------|------|-----------------|--------|
| 9.4513        | 12.5  | 50   | 7.2648          | 0.0    |
| 6.3191        | 25.0  | 100  | 5.6246          | 0.0    |
| 5.3703        | 37.5  | 150  | 5.0543          | 0.0    |
| 4.8443        | 50.0  | 200  | 4.4717          | 0.0    |
| 4.1997        | 62.5  | 250  | 3.8060          | 0.0    |
| 3.4159        | 75.0  | 300  | 3.0214          | 0.0    |
| 2.6566        | 87.5  | 350  | 2.3394          | 0.0000 |
| 2.1582        | 100.0 | 400  | 2.0106          | 0.0000 |
| 1.902         | 112.5 | 450  | 1.8156          | 0.0016 |
| 1.7201        | 125.0 | 500  | 1.5723          | 0.0000 |
| 1.4377        | 137.5 | 550  | 1.2928          | 0.0044 |
| 1.1887        | 150.0 | 600  | 1.0744          | 0.0038 |
| 0.9863        | 162.5 | 650  | 0.9181          | 0.0311 |
| 0.8339        | 175.0 | 700  | 0.7674          | 0.1133 |
| 0.7106        | 187.5 | 750  | 0.6533          | 0.1300 |
| 0.6131        | 200.0 | 800  | 0.5704          | 0.1339 |
| 0.5454        | 212.5 | 850  | 0.5155          | 0.1336 |
| 0.4952        | 225.0 | 900  | 0.4789          | 0.1210 |
| 0.4647        | 237.5 | 950  | 0.4567          | 0.1969 |
| 0.4461        | 250.0 | 1000 | 0.4467          | 0.1543 |
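
The Epoch and Step columns above imply how many optimizer steps make up one epoch, which in turn bounds the size of the training set. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; none of these settings is stated in this card. The function name `training_set_size_bounds` is hypothetical.

```python
def training_set_size_bounds(steps, epochs, batch_size):
    """Return (steps_per_epoch, min_examples, max_examples).

    Hypothetical helper. Assumes one device, no gradient accumulation, and
    that an incomplete last batch still counts as one step, so
    steps_per_epoch == ceil(num_examples / batch_size).
    """
    steps_per_epoch = steps / epochs
    if steps_per_epoch != int(steps_per_epoch):
        raise ValueError("steps / epochs is not a whole number of steps per epoch")
    k = int(steps_per_epoch)
    # ceil(n / batch_size) == k  <=>  (k - 1) * batch_size < n <= k * batch_size
    min_examples = (k - 1) * batch_size + 1
    max_examples = k * batch_size
    return k, min_examples, max_examples


if __name__ == "__main__":
    # Final row of the table: 1000 steps correspond to 250.0 epochs;
    # train_batch_size is 16.
    print(training_set_size_bounds(1000, 250.0, 16))
```

Under these assumptions the run used 4 steps per epoch, which would put the training set at between 49 and 64 examples. If gradient accumulation or several devices were used, the true number would be larger.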

Framework versions

  • Transformers 4.47.1
  • Pytorch 2.5.1+cu121
  • Datasets 2.19.2
  • Tokenizers 0.21.0