---
base_model: openai/whisper-medium
datasets:
  - miosipof/asr_en
language:
  - en
library_name: peft
license: apache-2.0
metrics:
  - wer
tags:
  - generated_from_trainer
model-index:
  - name: Whisper Medium
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: miosipof/asr_en
          type: miosipof/asr_en
          config: default
          split: train
          args: default
        metrics:
          - type: wer
            value: 106.7911714770798
            name: Wer
---

# Whisper Medium

This model is a PEFT adapter fine-tuned from [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) on the miosipof/asr_en dataset. It achieves the following results on the evaluation set:

- Loss: 3.9473
- Wer: 106.7912
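
The adapter can be loaded on top of the base Whisper model for inference. The sketch below is a minimal example, not the authors' original script; it assumes the adapter weights live in this repository (`miosipof/asr_EN_medium_v1`) and that audio is supplied as a 16 kHz waveform.

```python
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor
from peft import PeftModel

BASE_MODEL = "openai/whisper-medium"
ADAPTER_ID = "miosipof/asr_EN_medium_v1"  # assumed adapter repo id

# Load the frozen base model and attach the fine-tuned PEFT adapter.
processor = WhisperProcessor.from_pretrained(BASE_MODEL)
base_model = WhisperForConditionalGeneration.from_pretrained(BASE_MODEL)
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
model.eval()

def transcribe(waveform, sampling_rate=16_000):
    """Transcribe a 1-D float waveform sampled at 16 kHz (Whisper's expected rate)."""
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        generated_ids = model.generate(
            input_features=inputs.input_features,
            language="en",       # matches the card's language tag
            task="transcribe",
        )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```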

## Model description

This repository provides a PEFT adapter for openai/whisper-medium, fine-tuned for English automatic speech recognition on the miosipof/asr_en dataset. Details of the adapter configuration (e.g. which modules were adapted) have not been provided.

## Intended uses & limitations

The adapter targets English speech-to-text transcription. The evaluation WER of roughly 107% (WER can exceed 100% when insertion errors are counted) indicates it is not yet accurate enough for practical use without further training.

## Training and evaluation data

Training and evaluation used the miosipof/asr_en dataset (default configuration, train split), as declared in the model-index metadata above; no further description of the data is provided.
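
The dataset can be inspected with the `datasets` library. This is only an illustrative sketch; the column layout of miosipof/asr_en is not documented here, and access may require authentication if the repository is not public.

```python
from datasets import load_dataset

# The metadata above lists the "default" config and the "train" split.
ds = load_dataset("miosipof/asr_en", split="train")

print(ds)           # number of rows and column names
print(ds.features)  # schema, e.g. the audio and transcription columns
```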

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a rough `Seq2SeqTrainingArguments` equivalent is sketched after this list):

- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 128
- training_steps: 1024
- mixed_precision_training: Native AMP
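
For illustration, the settings above roughly correspond to the following `Seq2SeqTrainingArguments`; this is a hypothetical reconstruction, not the original training script, and the output directory name is an assumption.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./asr_EN_medium_v1",  # assumed output path
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,    # effective train batch size: 32 * 2 = 64
    seed=42,
    warmup_steps=128,
    max_steps=1024,
    lr_scheduler_type="linear",
    fp16=True,                        # "Native AMP" mixed-precision training
)
```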

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer      |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| 7.4432        | 4.2667  | 64   | 7.9570          | 167.9117 |
| 6.9871        | 8.5333  | 128  | 7.0858          | 167.5722 |
| 6.1972        | 12.8    | 192  | 6.3333          | 205.2632 |
| 5.9006        | 17.0667 | 256  | 6.0843          | 203.5654 |
| 5.61          | 21.3333 | 320  | 5.8153          | 168.7606 |
| 5.2344        | 25.6    | 384  | 5.4746          | 168.0815 |
| 4.8067        | 29.8667 | 448  | 5.0913          | 168.2513 |
| 4.3927        | 34.1333 | 512  | 4.7586          | 201.5280 |
| 4.1103        | 38.4    | 576  | 4.5158          | 164.5161 |
| 3.8975        | 42.6667 | 640  | 4.3460          | 108.4890 |
| 3.7471        | 46.9333 | 704  | 4.2178          | 109.1681 |
| 3.6146        | 51.2    | 768  | 4.1226          | 108.3192 |
| 3.53          | 55.4667 | 832  | 4.0471          | 107.9796 |
| 3.4579        | 59.7333 | 896  | 3.9927          | 107.4703 |
| 3.4061        | 64.0    | 960  | 3.9594          | 106.9610 |
| 3.3577        | 68.2667 | 1024 | 3.9473          | 106.7912 |
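
For reference, WER here is the standard word error rate, which can exceed 100% when insertions outnumber correct words. A minimal sketch of computing it with the `evaluate` library (a plausible tooling choice, not necessarily what was used for this card):

```python
import evaluate

# WER = (substitutions + deletions + insertions) / number of reference words
wer_metric = evaluate.load("wer")

predictions = ["the quick brown fox jumps over it over the lazy dog today"]
references  = ["the quick brown fox jumps over the lazy dog"]

# compute() returns a fraction; the table above reports it as a percentage.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.2f}%")
```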

### Framework versions

- PEFT 0.12.0
- Transformers 4.44.2
- Pytorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1