metadata

language: vi
datasets:
  - youtube-vi-13k-hours
tags:
  - speech
license: cc-by-nc-4.0

Vietnamese Wav2Vec2-Large model

Our self-supervised model is pre-trained on 13k hours of Vietnamese YouTube audio.

Usage

Our model has the same architecture as the English wav2vec2 model, so the fine-tuning procedure for that model applies here as well; see this notebook for details.
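Fine-tuning a pre-trained wav2vec2 checkpoint for speech recognition with CTC typically begins with building a character vocabulary from the training transcripts. The snippet below is a minimal sketch of that step for Vietnamese text, not code taken from the notebook. The function name `build_ctc_vocab`, the sample sentences, and the special-token names (`|`, `[UNK]`, `[PAD]`) are assumptions chosen for illustration.

```python
import json
import unicodedata


def build_ctc_vocab(transcripts):
    """Map every character in the transcripts to an integer id for CTC.

    Hypothetical helper for illustration; not part of any library.
    """
    chars = set()
    for text in transcripts:
        # NFC composes a base letter and its diacritics into one code point
        # where a precomposed form exists, which is the case for Vietnamese.
        chars.update(unicodedata.normalize("NFC", text.lower()))
    # Spaces are represented by the word-delimiter token added below.
    chars.discard(" ")
    vocab = {ch: idx for idx, ch in enumerate(sorted(chars))}
    # Assumed convention: "|" marks word boundaries, "[UNK]" is the unknown
    # token, and "[PAD]" doubles as the CTC blank token.
    for token in ("|", "[UNK]", "[PAD]"):
        vocab.setdefault(token, len(vocab))
    return vocab


sample_transcripts = ["xin chào", "cảm ơn bạn"]  # made-up examples
vocab = build_ctc_vocab(sample_transcripts)
print(json.dumps(vocab, ensure_ascii=False))
```

A vocabulary like this could be saved as a JSON file and passed to a CTC tokenizer such as `Wav2Vec2CTCTokenizer` from the `transformers` library, assuming the same special-token names are configured there.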

Contact

[email protected] / [email protected]