Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings Paper • 2305.10786 • Published May 18, 2023
MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation Paper • 2312.11825 • Published Dec 19, 2023
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published Jan 10, 2025
HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution Paper • 2501.10045 • Published Jan 17, 2025
InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation Paper • 2503.00084 • Published 21 days ago