metadata
language:
- it
license: apache-2.0
datasets:
- mozilla-foundation/common_voice_11_0
metrics:
- wer
- cer
tags:
- audio
- automatic-speech-recognition
- hf-asr-leaderboard
- it
- mozilla-foundation/common_voice_11_0
- speech
- wav2vec2
model-index:
- name: XLS-R Wav2Vec2 CV11Ita by radiogroup crits
  results:
  - task:
      name: Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 11.0 italian
      type: mozilla-foundation/common_voice_11_0
      args: it
    metrics:
    - name: Test WER
      type: wer
      value: 7.12
    - name: Test CER
      type: cer
      value: 1.75
    - name: Test WER (+LM)
      type: wer
      value: 5.77
    - name: Test CER (+LM)
      type: cer
      value: 1.51
XLS-R-1B-CV11ITA-LMWIKI500
Fine-tuned XLS-R 1B model for speech recognition in Italian
This model is facebook/wav2vec2-xls-r-1b fine-tuned on Italian using the train and validation splits of Common Voice 11.0.
When using this model, make sure that your speech input is sampled at 16 kHz.
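As a quick sanity check on the 16 kHz requirement, the sketch below reads the sample rate from a WAV file header using only the Python standard library. The helper names (`wav_sample_rate`, `needs_resampling`, `write_silent_wav`) are illustrative and not part of this repository; the check covers uncompressed WAV files only, and the resampling itself would have to be done with an audio library of your choice.

```python
import os
import tempfile
import wave

TARGET_SAMPLE_RATE = 16_000  # sample rate the model expects, in Hz


def wav_sample_rate(path: str) -> int:
    """Return the sample rate (Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path: str) -> bool:
    """True if the file is not already sampled at 16 kHz."""
    return wav_sample_rate(path) != TARGET_SAMPLE_RATE


def write_silent_wav(path: str, sample_rate: int, seconds: int = 1) -> None:
    """Write a mono 16-bit WAV of silence (hypothetical helper, demo only)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * sample_rate * seconds)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for rate in (16_000, 44_100):
            path = os.path.join(tmp, f"silence_{rate}.wav")
            write_silent_wav(path, rate)
            print(rate, "Hz needs resampling:", needs_resampling(path))
```

Files flagged by `needs_resampling` should be converted to 16 kHz before being passed to the model.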
Language model information
Our language model was built from a dataset containing 500 characters of text from each Italian Wikipedia article.
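The corpus preparation script is not included here, so the following is only a rough sketch of how such a dataset could be assembled. It assumes that the 500 characters are taken from the start of each article and that whitespace is collapsed first; both are assumptions, and `truncate_articles` is a hypothetical helper name.

```python
from typing import Iterable, Iterator

MAX_CHARS_PER_ARTICLE = 500  # per-article limit stated in the model card


def truncate_articles(
    articles: Iterable[str], max_chars: int = MAX_CHARS_PER_ARTICLE
) -> Iterator[str]:
    """Yield at most `max_chars` characters of text per article.

    Assumption: the text is taken from the beginning of the article and
    runs of whitespace are collapsed to single spaces beforehand.
    """
    for text in articles:
        snippet = " ".join(text.split())[:max_chars]
        if snippet:  # skip articles that are empty after cleaning
            yield snippet


if __name__ == "__main__":
    sample = ["Roma è la capitale\nd'Italia. " * 40, "   ", "Torino"]
    for line in truncate_articles(sample):
        print(len(line), line[:40])
```

Each yielded snippet would become one line of the text corpus used to train the n-gram language model.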
Download the Common Voice 11.0 dataset for Italian
from datasets import load_dataset
dataset = load_dataset("mozilla-foundation/common_voice_11_0", "it", use_auth_token=True)
Evaluation Commands
To evaluate on mozilla-foundation/common_voice_11_0 with split test:
python eval.py --model_id radiogroup-crits/wav2vec2-xls-r-1b-cv11ita-lmwiki500 --dataset mozilla-foundation/common_voice_11_0 --config it --split test --log_outputs --greedy
mv log_mozilla-foundation_common_voice_11_0_it_test_predictions.txt log_mozilla-foundation_common_voice_11_0_it_test_predictions_greedy.txt
mv log_mozilla-foundation_common_voice_11_0_it_test_targets.txt log_mozilla-foundation_common_voice_11_0_it_test_targets_greedy.txt
mv mozilla-foundation_common_voice_11_0_it_test_eval_results.txt mozilla-foundation_common_voice_11_0_it_test_eval_results_greedy.txt
python eval.py --model_id radiogroup-crits/wav2vec2-xls-r-1b-cv11ita-lmwiki500 --dataset mozilla-foundation/common_voice_11_0 --config it --split test --log_outputs
mv log_mozilla-foundation_common_voice_11_0_it_test_predictions.txt log_mozilla-foundation_common_voice_11_0_it_test_predictions_lm.txt
mv log_mozilla-foundation_common_voice_11_0_it_test_targets.txt log_mozilla-foundation_common_voice_11_0_it_test_targets_lm.txt
mv mozilla-foundation_common_voice_11_0_it_test_eval_results.txt mozilla-foundation_common_voice_11_0_it_test_eval_results_lm.txt
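For readers unfamiliar with the two metrics reported above, here is a minimal, dependency-free sketch of word error rate (WER) and character error rate (CER) for a single utterance, based on Levenshtein edit distance. It is not the implementation used by `eval.py`, which is not shown here; the function names are illustrative, and corpus-level scores are normally obtained by summing errors over all utterances before dividing.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: substitutions, insertions and deletions."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to hyp[:j]
    for i, ref_item in enumerate(ref, 1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, hyp_item in enumerate(hyp, 1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate of one utterance, as a fraction (not a percentage)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of one utterance, as a fraction."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    target = "il gatto dorme sul divano"
    prediction = "il gato dorme sul divano"
    print(f"WER: {wer(target, prediction):.2f}")  # 1 wrong word out of 5
    print(f"CER: {cer(target, prediction):.2f}")  # 1 wrong char out of 25
```

Multiplying these fractions by 100 gives values on the same scale as the figures in the metadata.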
Citation
To cite this model, you can use the following BibTeX entry:
@misc{crits2023wav2vec2-xls-r-1b-cv11ita-lmwiki500,
title={XLS-R Wav2Vec2 CV11Ita by radiogroup crits},
author={Teraoni Prioletti, Raffaele and Casagranda, Paolo and Russo, Francesco},
publisher={Hugging Face},
journal={Hugging Face Hub},
howpublished={\url{https://huggingface.co/radiogroup-crits/wav2vec2-xls-r-1b-cv11ita-lmwiki500}},
year={2023}
}