patrickvonplaten committed
Commit 2493a2c • 1 Parent(s): e5b789b
Update README.md
README.md CHANGED
@@ -21,7 +21,9 @@ Speech datasets from multiple domains were used to pretrain the model:
 - [Switchboard](https://catalog.ldc.upenn.edu/LDC97S62): telephone speech corpus; noisy telephone data
 - [Fisher](https://catalog.ldc.upenn.edu/LDC2004T19): conversational telephone speech; noisy telephone data
 
-When using the model make sure that your speech input is also sampled at 16kHz.
+When using the model make sure that your speech input is also sampled at 16kHz.
+
+**Note**: This model does not have a tokenizer as it was pretrained on audio alone. In order to use this model for **speech recognition**, a tokenizer should be created and the model should be fine-tuned on labeled text data. Check out [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for a more in-depth explanation of how to fine-tune the model.
 
 [Paper Robust Wav2Vec2](https://arxiv.org/abs/2104.01027)
 
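The added text stresses that inputs must be sampled at 16 kHz. As a rough, non-authoritative sketch of what that looks like with the `transformers` feature-extractor API: the checkpoint ID `facebook/wav2vec2-large-robust` (inferred from the linked paper, not stated in this diff) and the input file name are assumptions.

```python
# Minimal sketch, not part of the commit. Assumes the checkpoint is
# "facebook/wav2vec2-large-robust" and that torchaudio is available for resampling.
import torch
import torchaudio
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

model_id = "facebook/wav2vec2-large-robust"  # assumed repo ID; substitute the actual one
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2Model.from_pretrained(model_id)

# Resample the audio to 16 kHz, the rate the model was pretrained on.
waveform, sample_rate = torchaudio.load("example.wav")  # hypothetical input file
if sample_rate != 16_000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = feature_extractor(
    waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt"
)
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # (batch, frames, hidden_size)
```

Without a fine-tuned head, the output is only the encoder's hidden states, which matches the note that the checkpoint cannot transcribe speech on its own.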
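For the fine-tuning path the note describes, a hedged sketch of the setup (following the recipe in the linked blog post) might look as follows; the character vocabulary, the checkpoint ID, and the omitted training loop are placeholders, not part of the commit.

```python
# Sketch of wrapping the pretrained encoder with a CTC head for speech recognition.
# The vocabulary below is a stand-in; in practice it is built from the
# transcriptions of the labeled fine-tuning dataset.
import json
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2ForCTC,
    Wav2Vec2Processor,
)

model_id = "facebook/wav2vec2-large-robust"  # assumed repo ID; substitute the actual one

vocab = {"[PAD]": 0, "[UNK]": 1, "|": 2, "a": 3, "b": 4, "c": 5}  # placeholder vocab
with open("vocab.json", "w") as f:
    json.dump(vocab, f)

tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"
)
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_id)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# The CTC head is randomly initialized on top of the pretrained encoder and only
# becomes useful after fine-tuning on (audio, transcription) pairs.
model = Wav2Vec2ForCTC.from_pretrained(
    model_id,
    vocab_size=len(tokenizer),
    pad_token_id=tokenizer.pad_token_id,
    ctc_loss_reduction="mean",
)
# ... fine-tune with the Trainer API as described in the linked blog post ...
```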