metadata

library_name: transformers
language:
  - multilingual
license: apache-2.0
base_model: openai/whisper-tiny.en
tags:
  - generated_from_trainer
datasets:
  - edutjie/bisix_su_id
metrics:
  - wer
model-index:
  - name: 'BisiX: Sundanese Whisper'
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: SU ID ASR
          type: edutjie/bisix_su_id
          config: su_id_asr_source
          split: validation
          args: su_id_asr_source
        metrics:
          - name: Wer
            type: wer
            value: 33.87865168539326

BisiX: Sundanese Whisper

This model is a fine-tuned version of openai/whisper-tiny.en on the SU ID ASR dataset (edutjie/bisix_su_id, config su_id_asr_source). It achieves the following results on the validation split:

Loss: 1.0180
WER (word error rate): 33.8787
CER (character error rate): 11.6897
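The card does not say how the two error rates were computed, so the following is only a minimal sketch of the standard definitions: total edit distance divided by total reference length, expressed in percent. The helper names `edit_distance` and `error_rate` are hypothetical, the two sentence pairs are made-up illustrations rather than samples from the dataset, and counting spaces as characters for the character-level rate is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the hypothesis
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent: total edits / total reference units."""
    # Assumption: character-level scoring keeps spaces as characters.
    split = (lambda s: s.split()) if unit == "word" else list
    edits = sum(
        edit_distance(split(r), split(h))
        for r, h in zip(references, hypotheses)
    )
    total = sum(len(split(r)) for r in references)
    return 100.0 * edits / total


# Made-up example pairs (not taken from the evaluation data).
refs = ["abdi bade ka pasar", "wilujeng enjing"]
hyps = ["abdi bade pasar", "wilujeng enjing"]

wer = error_rate(refs, hyps, unit="word")
cer = error_rate(refs, hyps, unit="char")
print(f"WER: {wer:.2f}  CER: {cer:.2f}")
```

Because both rates are normalized by the reference length, a single dropped short word moves the word-level rate much more than the character-level rate, which is consistent with the character-level figure reported here being the lower of the two.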

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 30
training_steps: 150
mixed_precision_training: Native AMP
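As a rough illustration of how these values fit together, the sketch below recomputes the total train batch size and the learning rate that a linear schedule with warmup would give at a few steps. It assumes a single training device, which the card does not state. The `training_kwargs` dictionary is an assumed mapping onto `Seq2SeqTrainingArguments` keyword names, not the author's actual training script; in particular, `fp16=True` is a guess at what "Native AMP" meant here.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 32
EVAL_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 30
TRAINING_STEPS = 150


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


# Assumed mapping onto Seq2SeqTrainingArguments keyword arguments.
training_kwargs = {
    "learning_rate": LEARNING_RATE,
    "per_device_train_batch_size": TRAIN_BATCH_SIZE,
    "per_device_eval_batch_size": EVAL_BATCH_SIZE,
    "gradient_accumulation_steps": GRAD_ACCUM_STEPS,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "warmup_steps": WARMUP_STEPS,
    "max_steps": TRAINING_STEPS,
    "fp16": True,  # assumption: "Native AMP" taken to mean fp16 autocast
}

total_train_batch_size = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
print("total_train_batch_size:", total_train_batch_size)

for step in (0, 15, 30, 90, 150):
    print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the warmup covers the first 30 of 150 optimizer steps, so the learning rate spends a fifth of the run ramping up to its peak and the rest decaying toward zero.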

Training results

Training Loss	Epoch	Step	Validation Loss	WER	CER
4.3455	0.1765	30	2.4772	85.1326	33.9863
1.7093	0.3529	60	1.3486	41.4562	15.2167
1.2183	0.5294	90	1.1469	36.2247	12.5208
1.0676	0.7059	120	1.0517	34.6427	11.9084
0.9974	0.8824	150	1.0180	33.8787	11.6897
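The Epoch and Step columns can be combined to estimate how much data the run saw. The sketch below derives the number of optimizer steps per epoch from the logged rows and, using the total train batch size of 64 from the hyperparameters, an approximate training-set size. It also reports the relative WER reduction between consecutive checkpoints. The variable names are hypothetical, and the example count is an upper-bound estimate because the final batch of an epoch may be partial.

```python
# (step, epoch, wer) taken from the training results table.
rows = [
    (30, 0.1765, 85.1326),
    (60, 0.3529, 41.4562),
    (90, 0.5294, 36.2247),
    (120, 0.7059, 34.6427),
    (150, 0.8824, 33.8787),
]
TOTAL_TRAIN_BATCH_SIZE = 64  # from the hyperparameter list

# Each row gives an estimate of steps per epoch; average and round them.
steps_per_epoch = round(sum(step / epoch for step, epoch, _ in rows) / len(rows))

# Upper-bound estimate: the last batch in an epoch may hold fewer examples.
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Relative WER reduction (in percent) from each checkpoint to the next.
relative_wer_drop = [
    (curr[0], 100.0 * (prev[2] - curr[2]) / prev[2])
    for prev, curr in zip(rows, rows[1:])
]

print("steps per epoch:", steps_per_epoch)
print("approx. training examples:", approx_train_examples)
for step, drop in relative_wer_drop:
    print(f"step {step}: WER down {drop:.1f}% vs. previous checkpoint")
```

Read this way, the 150 training steps cover less than one full pass over the training data, and most of the WER improvement happens between steps 30 and 60.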

Framework versions

Transformers 4.44.2
PyTorch 2.4.1+cu121
Datasets 3.0.1
Tokenizers 0.19.1