Whisper Large v3 1500 Epochs 2 - nullonesix

This model is a fine-tuned version of distil-whisper/distil-small.en on the atc dataset. It achieves the following results on the evaluation set (an inference sketch follows the results):

  • Loss: 1.4151
  • WER: 39.2349
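
For reference, a minimal inference sketch using the Transformers ASR pipeline is shown below; the repository id and the audio file path are placeholders, not values confirmed by this card.

```python
# Minimal inference sketch for this fine-tuned checkpoint.
# The model id and audio path below are placeholders (assumptions);
# substitute the actual Hub repository and an ATC audio clip.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="nullonesix/whisper-atc",  # placeholder repository id
    device=0 if torch.cuda.is_available() else -1,
)

result = asr("example_atc_clip.wav")  # illustrative file path
print(result["text"])
```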

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 1500
  • mixed_precision_training: Native AMP
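
These values map onto Seq2SeqTrainingArguments from Transformers roughly as sketched below. output_dir, predict_with_generate, and the evaluation cadence are assumptions not stated in this card (the 100-step eval interval is inferred from the results table).

```python
# Rough mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
# Only the values listed above come from this card; anything marked as an
# assumption is illustrative.
import torch
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./distil-small-en-atc",   # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",                  # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=1500,
    fp16=torch.cuda.is_available(),       # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=100,                       # inferred from the results table
    predict_with_generate=True,           # needed to compute WER (assumption)
)
```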

Training results

Training Loss | Epoch   | Step | Validation Loss | WER
------------- | ------- | ---- | --------------- | -------
2.8313        | 3.5714  | 100  | 2.7177          | 74.1548
1.1366        | 7.1429  | 200  | 1.6407          | 63.0338
0.4394        | 10.7143 | 300  | 1.4737          | 47.4644
0.1686        | 14.2857 | 400  | 1.4481          | 46.3968
0.0761        | 17.8571 | 500  | 1.3707          | 40.8808
0.0452        | 21.4286 | 600  | 1.4051          | 38.5231
0.0188        | 25.0    | 700  | 1.4044          | 36.7883
0.0167        | 28.5714 | 800  | 1.4217          | 38.8345
0.0084        | 32.1429 | 900  | 1.4120          | 48.5765
0.0033        | 35.7143 | 1000 | 1.4151          | 39.2349
0.0022        | 39.2857 | 1100 | 1.4401          | 39.7242
0.0008        | 42.8571 | 1200 | 1.4591          | 39.5907
0.0007        | 46.4286 | 1300 | 1.4679          | 39.5907
0.0006        | 50.0    | 1400 | 1.4724          | 39.8577
0.0007        | 53.5714 | 1500 | 1.4737          | 39.7242
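
The WER column is the word error rate in percent. A minimal sketch of how it is typically computed with the evaluate library is shown below; the example strings are illustrative, not taken from the atc dataset.

```python
# Sketch of the WER computation behind the table above, using the
# `evaluate` library's "wer" metric. Example strings are illustrative.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["cleared for takeoff runway two seven"]
references = ["cleared for takeoff runway two seven left"]

wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```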

Framework versions

  • Transformers 4.42.3
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1