patrickvonplaten committed on
Commit 2493a2c
1 Parent(s): e5b789b

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -21,7 +21,9 @@ Speech datasets from multiple domains were used to pretrain the model:
  - [Switchboard](https://catalog.ldc.upenn.edu/LDC97S62): telephone speech corpus; noisy telephone data
  - [Fisher](https://catalog.ldc.upenn.edu/LDC2004T19): conversational telephone speech; noisy telephone data
 
- When using the model make sure that your speech input is also sampled at 16Khz. Note that this model should be fine-tuned on a downstream task, like Automatic Speech Recognition. Check out [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for more information.
+ When using the model make sure that your speech input is also sampled at 16Khz.
+
+ **Note**: This model does not have a tokenizer as it was pretrained on audio alone. In order to use this model **speech recognition**, a tokenizer should be created and the model should be fine-tuned on labeled text data. Check out [this blog](https://huggingface.co/blog/fine-tune-wav2vec2-english) for more in-detail explanation of how to fine-tune the model.
 
  [Paper Robust Wav2Vec2](https://arxiv.org/abs/2104.01027)
 
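
The added note says a tokenizer has to be created before fine-tuning for speech recognition. As a rough illustration of what that first step might look like, the sketch below builds a character-level vocabulary from labeled transcripts. The function name `build_ctc_vocab`, the example transcripts, and the special tokens `|`, `[UNK]`, and `[PAD]` are assumptions chosen for illustration, not part of this commit, and the sketch assumes `|` does not already occur in the transcripts.

```python
import json


def build_ctc_vocab(transcripts):
    """Map every character seen in the transcripts to an integer id.

    Hypothetical helper for illustration; assumes "|" is not present
    in the transcripts themselves.
    """
    # Collect the distinct characters and sort them for a deterministic order.
    chars = sorted(set("".join(transcripts)))
    vocab = {ch: idx for idx, ch in enumerate(chars)}

    # Replace the plain space with a visible word delimiter, keeping its id.
    if " " in vocab:
        vocab["|"] = vocab.pop(" ")

    # Append special tokens for unknown characters and for padding.
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


# Example with made-up transcripts; real labeled data would be used instead.
example_vocab = build_ctc_vocab(["hello world", "noisy telephone data"])
print(json.dumps(example_vocab, ensure_ascii=False))
```

Written to a `vocab.json` file, a mapping like this could then be loaded by a CTC tokenizer such as `Wav2Vec2CTCTokenizer` in `transformers`; the linked blog post describes the full fine-tuning workflow.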