Whisper Large v3 1500 Epochs 2 - nullonesix

This model is a fine-tuned version of distil-whisper/distil-small.en on the atc dataset. It achieves the following results on the evaluation set (an inference sketch follows the results):

  • Loss: 1.4151
  • WER: 39.2349
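
For reference, a minimal inference sketch using the Transformers ASR pipeline is shown below; the repository id and the audio file path are placeholders, not values confirmed by this card.

```python
# Minimal inference sketch for this fine-tuned checkpoint.
# The model id and audio path below are placeholders (assumptions);
# substitute the actual Hub repository and an ATC audio clip.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="nullonesix/whisper-atc",  # placeholder repository id
    device=0 if torch.cuda.is_available() else -1,
)

result = asr("example_atc_clip.wav")  # illustrative file path
print(result["text"])
```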

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 1500
  • mixed_precision_training: Native AMP
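
These values map onto Seq2SeqTrainingArguments from Transformers roughly as sketched below. output_dir, predict_with_generate, and the evaluation cadence are assumptions not stated in this card (the 100-step eval interval is inferred from the results table).

```python
# Rough mapping of the listed hyperparameters onto Seq2SeqTrainingArguments.
# Only the values listed above come from this card; anything marked as an
# assumption is illustrative.
import torch
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./distil-small-en-atc",   # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",                  # betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=1500,
    fp16=torch.cuda.is_available(),       # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=100,                       # inferred from the results table
    predict_with_generate=True,           # needed to compute WER (assumption)
)
```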

Training results

Training Loss | Epoch   | Step | Validation Loss | WER
------------- | ------- | ---- | --------------- | -------
2.8313        | 3.5714  | 100  | 2.7177          | 74.1548
1.1366        | 7.1429  | 200  | 1.6407          | 63.0338
0.4394        | 10.7143 | 300  | 1.4737          | 47.4644
0.1686        | 14.2857 | 400  | 1.4481          | 46.3968
0.0761        | 17.8571 | 500  | 1.3707          | 40.8808
0.0452        | 21.4286 | 600  | 1.4051          | 38.5231
0.0188        | 25.0    | 700  | 1.4044          | 36.7883
0.0167        | 28.5714 | 800  | 1.4217          | 38.8345
0.0084        | 32.1429 | 900  | 1.4120          | 48.5765
0.0033        | 35.7143 | 1000 | 1.4151          | 39.2349
0.0022        | 39.2857 | 1100 | 1.4401          | 39.7242
0.0008        | 42.8571 | 1200 | 1.4591          | 39.5907
0.0007        | 46.4286 | 1300 | 1.4679          | 39.5907
0.0006        | 50.0    | 1400 | 1.4724          | 39.8577
0.0007        | 53.5714 | 1500 | 1.4737          | 39.7242
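
The WER column is the word error rate in percent. A minimal sketch of how it is typically computed with the evaluate library is shown below; the example strings are illustrative, not taken from the atc dataset.

```python
# Sketch of the WER computation behind the table above, using the
# `evaluate` library's "wer" metric. Example strings are illustrative.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["cleared for takeoff runway two seven"]
references = ["cleared for takeoff runway two seven left"]

wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```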

Framework versions

  • Transformers 4.42.3
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1