PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation Paper • 2407.02869 • Published Jul 3 • 18
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining Paper • 2308.05734 • Published Aug 10, 2023 • 36
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation Paper • 2211.06687 • Published Nov 12, 2022 • 3