metadata
language: vi
datasets:
- youtube-vi-13k-hours
tags:
- speech
license: cc-by-nc-4.0
Vietnamese Wav2Vec2-Large model
Our self-supervised model is pre-trained on 13k hours of Vietnamese YouTube audio.
Usage
Our model has the same architecture as the English wav2vec2 model, so you can follow this notebook to fine-tune it.
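When preparing fine-tuning data, it can help to know how many output frames the encoder produces for a given clip, since a CTC transcript cannot be longer than the frame sequence. The sketch below is not taken from this model card: it assumes the standard wav2vec 2.0 convolutional feature encoder (the kernel and stride values shown) and 16 kHz mono input, which should hold if the architecture really matches the English model. The function name is hypothetical.

```python
# Assumed defaults of the standard wav2vec 2.0 feature encoder.
# These values are an assumption here, not stated in this model card.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)
SAMPLE_RATE = 16_000  # Hz, assumed input sampling rate


def encoder_output_frames(num_samples: int) -> int:
    """Return the number of frames the conv feature encoder emits
    for a waveform of `num_samples` samples (no padding)."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if length < kernel:
            # Input is shorter than the receptive field of this layer.
            return 0
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


if __name__ == "__main__":
    print(encoder_output_frames(1 * SAMPLE_RATE))   # 49 frames for 1 second
    print(encoder_output_frames(10 * SAMPLE_RATE))  # 499 frames for 10 seconds
```

Under these assumptions the encoder emits roughly one frame per 20 ms of audio, so clips whose transcripts have more tokens than frames would need to be filtered out or re-segmented before CTC fine-tuning.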