WhisperX: Time-Accurate Speech Transcription of Long-Form Audio • Paper • 2303.00747 • Published Mar 1, 2023
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion • Paper • 2311.14836 • Published Nov 24, 2023
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations • Paper • 2308.11466 • Published Aug 22, 2023
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training • Paper • 2108.06209 • Published Aug 7, 2021
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations • Paper • 2006.11477 • Published Jun 20, 2020