---
library_name: transformers
license: apache-2.0
base_model: openai/whisper-medium
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: whisper-large-v3
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# whisper-large-v3
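This model is a fine-tuned version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium); the training dataset was not recorded. Note that the card's metadata lists `openai/whisper-medium` as the base model even though the model is named `whisper-large-v3`.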
It achieves the following results on the evaluation set:
- Loss: 0.1965
- Wer Ortho: 18.1002
- Wer: 15.9525
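
The card does not define the two WER figures. A common convention in Whisper fine-tuning scripts is that "Wer Ortho" scores the raw (orthographic) text while "Wer" scores text after normalization; that reading is an assumption here, not something the card confirms. The sketch below shows how a word error rate is computed in general, as word-level edit distance divided by the number of reference words, expressed in percent. The `normalize` helper is hypothetical (lowercasing and stripping ASCII punctuation) and is not the normalizer used for this model.

```python
import string


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def normalize(text: str) -> str:
    """Hypothetical normalizer: lowercase and remove ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


reference = "Hello, world."
hypothesis = "hello world"
ortho_wer = word_error_rate(reference, hypothesis)
normalized_wer = word_error_rate(normalize(reference), normalize(hypothesis))
print(f"orthographic WER: {ortho_wer:.1f}, normalized WER: {normalized_wer:.1f}")
```

Because insertions count as errors, a hypothesis much longer than its reference can score above 100, which is consistent with a value such as 100.4724 appearing in a training log.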
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_steps: 50
- training_steps: 10000
- mixed_precision_training: Native AMP
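
Two of the values above follow from the others by simple arithmetic. The sketch below is a plain-Python illustration, not the training script. It assumes gradient accumulation of 1 (the card lists no accumulation setting, and 16 × 4 = 64 matches the reported total) and a linear warm-up from 0 to the base learning rate over the warm-up steps, which is how the `constant_with_warmup` scheduler type is generally described. All function and constant names are hypothetical.

```python
LEARNING_RATE = 1e-05
WARMUP_STEPS = 50
PER_DEVICE_TRAIN_BATCH_SIZE = 16
NUM_DEVICES = 4
GRADIENT_ACCUMULATION_STEPS = 1  # assumption: not listed in the card


def constant_with_warmup_lr(step: int,
                            base_lr: float = LEARNING_RATE,
                            warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at a given optimizer step: linear ramp, then constant."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


# Effective batch size seen by the optimizer at each step.
total_train_batch_size = (
    PER_DEVICE_TRAIN_BATCH_SIZE * NUM_DEVICES * GRADIENT_ACCUMULATION_STEPS
)

print("total_train_batch_size:", total_train_batch_size)
for step in (0, 25, 50, 10000):
    print(step, constant_with_warmup_lr(step))
```

Under these assumptions, the warm-up covers only the first 50 of the 10000 training steps, so the learning rate is effectively constant at 1e-05 for almost the entire run.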
### Training results
| Training Loss | Epoch | Step | Validation Loss | Wer Ortho | Wer |
|:-------------:|:------:|:-----:|:---------------:|:---------:|:--------:|
| 0.6288 | 0.0952 | 500 | 0.6102 | 55.9280 | 60.8769 |
| 0.5324 | 0.1904 | 1000 | 0.5052 | 39.4707 | 42.5276 |
| 0.4501 | 0.2856 | 1500 | 0.4515 | 61.0459 | 54.8192 |
| 0.4097 | 0.3807 | 2000 | 0.4170 | 55.1628 | 61.1920 |
| 0.3907 | 0.4759 | 2500 | 0.3918 | 32.0076 | 28.6487 |
| 0.3647 | 0.5711 | 3000 | 0.3704 | 63.9223 | 100.4724 |
| 0.3832 | 0.6663 | 3500 | 0.3503 | 28.5079 | 24.8599 |
| 0.3584 | 0.7615 | 4000 | 0.3356 | 25.4798 | 21.6963 |
| 0.3358 | 0.8567 | 4500 | 0.3208 | 30.3739 | 23.8063 |
| 0.3157 | 0.9518 | 5000 | 0.3068 | 30.6595 | 24.0364 |
| 0.2682 | 1.0470 | 5500 | 0.2945 | 28.6989 | 31.7195 |
| 0.2809 | 1.1422 | 6000 | 0.2834 | 40.9943 | 42.9384 |
| 0.2640 | 1.2374 | 6500 | 0.2726 | 21.4030 | 17.7449 |
| 0.2310 | 1.3326 | 7000 | 0.2626 | 20.2943 | 16.7944 |
| 0.2162 | 1.4278 | 7500 | 0.2502 | 21.4164 | 18.6420 |
| 0.2581 | 1.5229 | 8000 | 0.2375 | 18.9646 | 20.5258 |
| 0.2395 | 1.6181 | 8500 | 0.2282 | 21.2771 | 17.5843 |
| 0.1951 | 1.7133 | 9000 | 0.2185 | 19.0834 | 15.9387 |
| 0.1733 | 1.8085 | 9500 | 0.2086 | 19.9144 | 18.8285 |
| 0.1896 | 1.9037 | 10000 | 0.1965 | 18.1002 | 15.9525 |
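
The table above can be read programmatically to find the best checkpoint per metric and to estimate the size of one epoch. The snippet below is a minimal sketch that copies the evaluation columns from the table. The variable names are hypothetical, and the dataset-size estimate assumes the total train batch size of 64 listed under the hyperparameters with no gradient accumulation.

```python
# (step, epoch, validation_loss, wer_ortho, wer), copied from the table above.
rows = [
    (500, 0.0952, 0.6102, 55.9280, 60.8769),
    (1000, 0.1904, 0.5052, 39.4707, 42.5276),
    (1500, 0.2856, 0.4515, 61.0459, 54.8192),
    (2000, 0.3807, 0.4170, 55.1628, 61.1920),
    (2500, 0.4759, 0.3918, 32.0076, 28.6487),
    (3000, 0.5711, 0.3704, 63.9223, 100.4724),
    (3500, 0.6663, 0.3503, 28.5079, 24.8599),
    (4000, 0.7615, 0.3356, 25.4798, 21.6963),
    (4500, 0.8567, 0.3208, 30.3739, 23.8063),
    (5000, 0.9518, 0.3068, 30.6595, 24.0364),
    (5500, 1.0470, 0.2945, 28.6989, 31.7195),
    (6000, 1.1422, 0.2834, 40.9943, 42.9384),
    (6500, 1.2374, 0.2726, 21.4030, 17.7449),
    (7000, 1.3326, 0.2626, 20.2943, 16.7944),
    (7500, 1.4278, 0.2502, 21.4164, 18.6420),
    (8000, 1.5229, 0.2375, 18.9646, 20.5258),
    (8500, 1.6181, 0.2282, 21.2771, 17.5843),
    (9000, 1.7133, 0.2185, 19.0834, 15.9387),
    (9500, 1.8085, 0.2086, 19.9144, 18.8285),
    (10000, 1.9037, 0.1965, 18.1002, 15.9525),
]

# Lowest value per metric; each result is the full row for that checkpoint.
best_by_loss = min(rows, key=lambda r: r[2])
best_by_wer_ortho = min(rows, key=lambda r: r[3])
best_by_wer = min(rows, key=lambda r: r[4])

# Estimate steps per epoch from the final row, then the approximate number of
# training examples, assuming 64 examples per optimizer step.
last_step, last_epoch = rows[-1][0], rows[-1][1]
steps_per_epoch = round(last_step / last_epoch)
approx_train_examples = steps_per_epoch * 64

print("best validation loss at step", best_by_loss[0])
print("best Wer Ortho at step", best_by_wer_ortho[0])
print("best Wer at step", best_by_wer[0])
print("steps per epoch ~", steps_per_epoch, "| examples ~", approx_train_examples)
```

On these numbers, validation loss and Wer Ortho are lowest at the final step (10000), while Wer is marginally lower at step 9000 (15.9387) than at step 10000 (15.9525). Validation loss falls at every evaluation, but both WER columns fluctuate strongly between checkpoints, so a single WER reading should be interpreted with care.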
### Framework versions
- Transformers 4.44.2
- PyTorch 2.4.1+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1