johntsi/ZeroSwot-Medium_asr-cv_mt-covost2_en-to-15 Automatic Speech Recognition • Updated Aug 17, 2024 • 13
johntsi/ZeroSwot-Medium_asr-mustc_mt-mustc_en-to-8 Automatic Speech Recognition • Updated Aug 17, 2024 • 11
johntsi/ZeroSwot-Large_asr-mustc_mt-mustc_en-to-8 Automatic Speech Recognition • Updated Aug 17, 2024 • 11
johntsi/ZeroSwot-Large_asr-cv_mt-covost2_en-to-15 Automatic Speech Recognition • Updated Aug 17, 2024 • 11
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity Paper • 2407.10387 • Published Jul 15, 2024 • 6
Pushing the Limits of Zero-shot End-to-End Speech Translation Paper • 2402.10422 • Published Feb 16, 2024
Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23 Paper • 2306.01327 • Published Jun 2, 2023
Explaining How Transformers Use Context to Build Predictions Paper • 2305.12535 • Published May 21, 2023
SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations Paper • 2212.09699 • Published Dec 19, 2022
Efficient Speech Translation with Dynamic Latent Perceivers Paper • 2210.16264 • Published Oct 28, 2022