NbAiLab/whisper-small-3NRKonly-nob

Whisper Small Norwegian Bokmål

This model is a fine-tuned version of openai/whisper-small trained on NCC_S_3-NRKonly.

The model is currently in the middle of a long training run.

Model description

The model is trained on a large corpus of roughly 4,000 hours of speech. The source material is subtitles from the Norwegian broadcaster NRK.

Intended uses & limitations

The model will be free for everyone to use when it is finished.
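As a rough illustration of how a fine-tuned Whisper checkpoint like this one is typically loaded, the sketch below uses the `transformers` automatic-speech-recognition pipeline. It assumes the checkpoint is published under the repository name shown on this card and that `transformers` and an audio backend such as `ffmpeg` are installed. The helper name `transcribe` and the file name in the usage note are hypothetical. Because training is still in progress, output quality may differ from the final model.

```python
MODEL_ID = "NbAiLab/whisper-small-3NRKonly-nob"  # repository name taken from this card


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text.

    Hypothetical helper for illustration; not part of the model repository.
    """
    # Imported lazily so that merely defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # Whisper operates on 30-second windows
    )
    result = asr(audio_path)
    return result["text"]
```

A call such as `transcribe("sample.wav")` (hypothetical file name) downloads the checkpoint on first use and returns the transcription as a string.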

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-06
train_batch_size: 128
gradient_accumulation_steps: 2
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant with warmup
lr_scheduler_warmup_steps: 1000
training_steps: 50,000 (currently at step 1,000)
mixed_precision_training: fp16
deepspeed: true
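To make the schedule concrete, here is a minimal sketch of what "constant with warmup" means for the values listed above. It assumes a linear ramp from zero during warmup, as in the common Hugging Face implementation. The function and constant names are illustrative and are not taken from the training code. The effective batch size is computed per device, since the number of devices is not stated.

```python
PEAK_LR = 3e-06            # learning_rate
WARMUP_STEPS = 1000        # lr_scheduler_warmup_steps
TRAIN_BATCH_SIZE = 128     # train_batch_size
GRAD_ACCUM_STEPS = 2       # gradient_accumulation_steps


def lr_at(step: int, peak_lr: float = PEAK_LR, warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at a given optimizer step.

    Assumes a linear ramp from 0 to peak_lr over warmup_steps,
    then a constant rate for the rest of training.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr


# Samples consumed per optimizer update on a single device
# (the number of devices is not given on this card).
effective_batch_per_device = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

if __name__ == "__main__":
    for step in (0, 500, 1000, 50000):
        print(step, lr_at(step))
    print("effective batch per device:", effective_batch_per_device)
```

Under these assumptions, the current checkpoint at step 1,000 sits exactly at the end of warmup, so the remaining steps would all run at the full learning rate.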

Live Training results

See TensorBoard Metrics

Evaluation results

WER on FLEURS (validation set, self-reported): 15.560
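For context, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition, not the evaluation script behind the figure above. It skips the text normalization that real evaluations usually apply, and the function name is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Returns a fraction; multiply by 100 for a percentage.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous row.
    prev_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr_row.append(
                min(
                    prev_row[j] + 1,                      # deletion
                    curr_row[j - 1] + 1,                  # insertion
                    prev_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev_row = curr_row

    return prev_row[-1] / len(ref_words)
```

For example, comparing the reference "jeg liker kaffe" with the hypothesis "jeg liker te" gives one substitution over three reference words, a WER of one third.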