metadata
language:
  - it
license: apache-2.0
tags:
  - audio
  - automatic-speech-recognition
  - hf-asr-leaderboard
  - it
  - mozilla-foundation/common_voice_8_0
  - speech
  - wav2vec2
datasets:
  - mozilla-foundation/common_voice_8_0
metrics:
  - wer
  - cer
base_model: facebook/wav2vec2-xls-r-1b
model-index:
  - name: XLS-R Wav2Vec2 Italian by radiogroup crits
    results:
      - task:
          type: automatic-speech-recognition
          name: Speech Recognition
        dataset:
          name: Common Voice 8.0 italian
          type: mozilla-foundation/common_voice_8_0
          args: it
        metrics:
          - type: wer
            value: 9.04
            name: Test WER
          - type: cer
            value: 2.2
            name: Test CER
          - type: wer
            value: 6.24
            name: Test WER (+LM)
          - type: cer
            value: 1.67
            name: Test CER (+LM)

XLS-R-1B-ITALIAN-DOC4LM-5GRAM

Fine-tuned XLS-R 1B model for speech recognition in Italian

This model is facebook/wav2vec2-xls-r-1b fine-tuned on Italian using the train and validation splits of Common Voice 8.0, Multilingual TEDx, Multilingual LibriSpeech, and VoxPopuli.

When using this model, make sure that your speech input is sampled at 16 kHz.
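As a quick pre-flight check for the 16 kHz requirement, the sketch below reads the sampling rate from a WAV file header using only the Python standard library. The helper names `wav_sample_rate` and `needs_resampling` are hypothetical and not part of this model's code; the sketch assumes uncompressed PCM WAV input and only detects a mismatch. The resampling itself is left to whichever audio library you already use.

```python
import wave

EXPECTED_RATE = 16_000  # sampling rate this model card asks for, in Hz


def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, expected_rate=EXPECTED_RATE):
    """True when the file's sampling rate differs from the expected rate."""
    return wav_sample_rate(path) != expected_rate
```

A file for which `needs_resampling` returns `True` should be converted to 16 kHz before it is passed to the model.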

Language model information

Our language model was built from a dataset of Italian Wikipedia articles and manual transcriptions of radio news bulletins and television programs.

Download the Common Voice 8.0 dataset for Italian

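from datasets import load_dataset

# Common Voice is a gated dataset: accept its terms on the Hugging Face Hub
# and log in (e.g. with `huggingface-cli login`) before running this.
# "it" selects the Italian configuration.
dataset = load_dataset("mozilla-foundation/common_voice_8_0", "it", use_auth_token=True)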

Evaluation Commands

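To evaluate on the test split of mozilla-foundation/common_voice_8_0, first with greedy decoding and then with the language model (the output files of each run are renamed so that the second run does not overwrite the first):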

python eval.py --model_id radiogroup-crits/wav2vec2-xls-r-1b-italian-doc4lm-5gram --dataset mozilla-foundation/common_voice_8_0 --config it --split test --log_outputs --greedy

mv log_mozilla-foundation_common_voice_8_0_it_test_predictions.txt log_mozilla-foundation_common_voice_8_0_it_test_predictions_greedy.txt

mv log_mozilla-foundation_common_voice_8_0_it_test_targets.txt log_mozilla-foundation_common_voice_8_0_it_test_targets_greedy.txt

mv mozilla-foundation_common_voice_8_0_it_test_eval_results.txt mozilla-foundation_common_voice_8_0_it_test_eval_results_greedy.txt

python eval.py --model_id radiogroup-crits/wav2vec2-xls-r-1b-italian-doc4lm-5gram --dataset mozilla-foundation/common_voice_8_0 --config it --split test --log_outputs

mv log_mozilla-foundation_common_voice_8_0_it_test_predictions.txt log_mozilla-foundation_common_voice_8_0_it_test_predictions_lm.txt

mv log_mozilla-foundation_common_voice_8_0_it_test_targets.txt log_mozilla-foundation_common_voice_8_0_it_test_targets_lm.txt

mv mozilla-foundation_common_voice_8_0_it_test_eval_results.txt mozilla-foundation_common_voice_8_0_it_test_eval_results_lm.txt
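The results reported for this model are word error rate (WER) and character error rate (CER). For readers unfamiliar with these metrics, here is a minimal pure-Python sketch of how they are usually defined: the edit distance between reference and hypothesis, divided by the reference length. It is not the code used by eval.py; the function names are illustrative, and it scores a single utterance with no text normalization, whereas a corpus-level score typically sums the errors over all utterances before dividing.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate of one hypothesis against one reference transcript."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate of one hypothesis against one reference transcript."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

For example, `wer("il gatto dorme sul divano", "il gato dorme sul divano")` counts one substituted word out of five reference words.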

Citation

To cite this model, you can use the following BibTeX entry:

@misc{crits2022wav2vec2-xls-r-1b-italian-doc4lm-5gram,
  title={XLS-R Wav2Vec2 Italian by radiogroup crits},
  author={Teraoni Prioletti, Raffaele and Casagranda, Paolo and Russo, Francesco},
  publisher={Hugging Face},
  journal={Hugging Face Hub},
  howpublished={\url{https://huggingface.co/radiogroup-crits/wav2vec2-xls-r-1b-italian-doc4lm-5gram}},
  year={2022}
}