MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models Paper • 2310.11954 • Published Oct 18, 2023 • 25
UniAudio: An Audio Foundation Model Toward Universal Audio Generation Paper • 2310.00704 • Published Oct 1, 2023 • 21
E3 TTS: Easy End-to-End Diffusion-based Text to Speech Paper • 2311.00945 • Published Nov 2, 2023 • 14
In-Context Prompt Editing For Conditional Audio Generation Paper • 2311.00895 • Published Nov 1, 2023 • 10
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis Paper • 2312.03491 • Published Dec 6, 2023 • 33
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation Paper • 2407.02869 • Published Jul 3, 2024 • 18
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs Paper • 2407.04051 • Published Jul 4, 2024 • 35
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation Paper • 2407.15060 • Published Jul 21, 2024 • 9