Whisper Small ATC - ATCText

This model is a fine-tuned version of openai/whisper-small on the ATC dataset. It achieves the following results on the evaluation set:

Loss: 0.2486
Wer: 10.6129
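
The card does not say how the Wer figure was computed. As a minimal sketch, assuming the usual definition (word-level edit distance divided by the number of reference words) and assuming the value is reported as a percentage, it could be calculated as follows. The function name and the example transcripts are hypothetical and are not taken from the evaluation set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length * 100."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical ATC-style utterance: the hypothesis drops the word "flight".
reference = "lufthansa three two one descend flight level eight zero"
hypothesis = "lufthansa three two one descend level eight zero"
example_wer = word_error_rate(reference, hypothesis)
```

Here one deleted word out of nine reference words gives a WER of roughly 11.1, which is on the same scale as the figure reported above if that figure is indeed a percentage.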

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 4000
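
To illustrate how the listed learning_rate, lr_scheduler_type, lr_scheduler_warmup_steps and training_steps interact, here is a small sketch of a linear schedule with warmup. It assumes the common behaviour for this scheduler type: the rate rises linearly from 0 to the peak over the warmup steps, then decays linearly to 0 at the final step. The function name is hypothetical, and the code is a stand-alone approximation rather than the training script that was used.

```python
LEARNING_RATE = 1e-05   # peak learning rate from the list above
WARMUP_STEPS = 500      # lr_scheduler_warmup_steps
TRAINING_STEPS = 4000   # training_steps


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        # Warmup phase: scale linearly from 0 up to the peak rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay phase: scale linearly from the peak rate down to 0 at the last step.
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


# Learning rate at the four evaluation checkpoints reported in the training results.
checkpoint_lrs = {step: linear_lr(step) for step in (1000, 2000, 3000, 4000)}
```

Under these assumptions the rate peaks at step 500 and has already fallen to about 2.9e-06 by step 3000, so the last 1000 steps make comparatively small updates.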

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.2533	0.42	1000	0.3465	16.2868
0.235	0.84	2000	0.2881	13.5237
0.0851	1.27	3000	0.2607	10.6048
0.1317	1.69	4000	0.2486	10.6129
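
The table can be summarised with a few lines of Python. The sketch below copies the step, validation loss and Wer columns from the rows above and computes the relative Wer reduction between the first and last checkpoint, along with the step that has the lowest Wer. Variable names are illustrative only.

```python
# (step, validation_loss, wer) copied from the training results table.
checkpoints = [
    (1000, 0.3465, 16.2868),
    (2000, 0.2881, 13.5237),
    (3000, 0.2607, 10.6048),
    (4000, 0.2486, 10.6129),
]

first_wer = checkpoints[0][2]
final_wer = checkpoints[-1][2]

# Relative improvement of the final checkpoint over the first one, in percent.
relative_wer_reduction = 100.0 * (first_wer - final_wer) / first_wer

# Checkpoints with the lowest Wer and the lowest validation loss.
best_wer_step = min(checkpoints, key=lambda row: row[2])[0]
best_loss_step = min(checkpoints, key=lambda row: row[1])[0]
```

This gives a relative Wer reduction of about 34.8% between step 1000 and step 4000. The lowest Wer is at step 3000 (10.6048), while the lowest validation loss is at step 4000, so the two metrics disagree slightly on which checkpoint is best.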

Framework versions

Transformers 4.39.3
Pytorch 2.2.2
Datasets 2.18.0
Tokenizers 0.15.2

Additional Information

Licensing Information

The licensing of the dataset follows the terms chosen by the creators of the UWB-ATCC corpus.

They released the corpus under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

Citation Information

Contributors who prepared, processed, normalized and uploaded the dataset to Hugging Face:

@article{zuluaga2022how, title={How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications}, author={Zuluaga-Gomez, Juan and Prasad, Amrutha and Nigmatulina, Iuliia and Sarfjoo, Saeed and others}, journal={IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar}, year={2022} }

@article{zuluaga2022bertraffic, title={BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications}, author={Zuluaga-Gomez, Juan and Sarfjoo, Seyyed Saeed and Prasad, Amrutha and others}, journal={IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar}, year={2022} }

@article{zuluaga2022atco2, title={ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications}, author={Zuluaga-Gomez, Juan and Vesel{\'y}, Karel and Sz{\"o}ke, Igor and Motlicek, Petr and others}, journal={arXiv preprint arXiv:2211.04054}, year={2022} }

Authors of the dataset:

@article{vsmidl2019air, title={Air traffic control communication (ATCC) speech corpora and their use for ASR and TTS development}, author={{\v{S}}m{\'\i}dl, Lubo{\v{s}} and {\v{S}}vec, Jan and Tihelka, Daniel and Matou{\v{s}}ek, Jind{\v{r}}ich and Romportl, Jan and Ircing, Pavel}, journal={Language Resources and Evaluation}, volume={53}, number={3}, pages={449--464}, year={2019}, publisher={Springer} }
