techiaith/whisper-large-v3-ft-btb-cv-cy

@@ -3,66 +3,42 @@ license: apache-2.0
 base_model: openai/whisper-large-v3
 tags:
 - generated_from_trainer
 metrics:
 - wer
 model-index:
 - name: whisper-large-v3-ft-btb-cv-cy
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # whisper-large-v3-ft-btb-cv-cy
-This model is a fine-tuned version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) on the DewiBrynJones/banc-trawsgrifiadau-bangor-clean train main, DewiBrynJones/commonvoice_18_0_cy train+dev+test main dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.3838
-- Wer: 0.2732
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 1e-05
-- train_batch_size: 16
-- eval_batch_size: 16
-- seed: 42
-- gradient_accumulation_steps: 2
-- total_train_batch_size: 32
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 500
-- training_steps: 5000
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch  | Step | Validation Loss | Wer    |
-|:-------------:|:------:|:----:|:---------------:|:------:|
-| 0.4047        | 0.5711 | 1000 | 0.4849          | 0.3505 |
-| 0.2476        | 1.1422 | 2000 | 0.4187          | 0.3137 |
-| 0.2527        | 1.7133 | 3000 | 0.3882          | 0.2901 |
-| 0.1568        | 2.2844 | 4000 | 0.3902          | 0.2816 |
-| 0.1313        | 2.8555 | 5000 | 0.3838          | 0.2732 |
-### Framework versions
-- Transformers 4.44.0
-- Pytorch 2.4.0+cu121
-- Datasets 2.20.0
-- Tokenizers 0.19.1

 base_model: openai/whisper-large-v3
 tags:
 - generated_from_trainer
+- verbatim
 metrics:
 - wer
 model-index:
 - name: whisper-large-v3-ft-btb-cv-cy
   results: []
+datasets:
+- techiaith/banc-trawsgrifiadau-bangor
+- techiaith/commonvoice_18_0_cy
+language:
+- cy
+pipeline_tag: automatic-speech-recognition
 ---
 # whisper-large-v3-ft-btb-cv-cy
+This model is a version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) fine-tuned with
+transcriptions of spontaneous Welsh-language speech from [Banc Trawsgrifiadau Bangor (btb)](https://huggingface.co/datasets/techiaith/banc-trawsgrifiadau-bangor),
+as well as recordings of read speech from [Welsh Common Voice version 18 (cv)](https://huggingface.co/datasets/techiaith/commonvoice_18_0_cy)
+for additional training.
+As such, this model is suitable for more verbatim transcription of spontaneous or unplanned speech.
+It achieves the following results on the [Banc Trawsgrifiadau Bangor test set](https://huggingface.co/datasets/techiaith/banc-trawsgrifiadau-bangor/viewer/default/test):
+- WER: 29.72
+- CER: 11.01
+## Usage
+```python
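+from transformers import pipeline
+# Load the fine-tuned Welsh speech recognition model from the Hugging Face Hub
+transcriber = pipeline("automatic-speech-recognition", model="techiaith/whisper-large-v3-ft-btb-cv-cy")
+# Placeholder: replace with the path or URL of your own sound file
+audio = "<path or url to soundfile>"
+result = transcriber(audio)
+print(result)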
+```
+`{'text': "ymm, yn y pum mlynadd dwitha 'ma ti 'di... Ie. ...bod drw dipyn felly do?"}`
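
The WER and CER figures reported above are word-level and character-level error rates. The card does not say which tool or text normalisation produced them, so the following is only a minimal sketch of the usual definition (edit distance divided by reference length), assuming the scores are percentages. The helper names `edit_distance`, `wer` and `cer` and the sentence pair are hypothetical and chosen purely for illustration; they are not taken from the test set or from the author's evaluation code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage of the number of reference words."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage of the number of reference characters."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical reference/hypothesis pair, for illustration only
reference = "yn y pum mlynedd diwethaf"
hypothesis = "yn y pum mlynadd dwitha"
print(f"WER: {wer(reference, hypothesis):.2f}")
print(f"CER: {cer(reference, hypothesis):.2f}")
```

In practice, scores depend heavily on how casing, punctuation and dialectal spellings are normalised before comparison, so numbers computed this way may not match the reported ones exactly.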