A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation Paper • 2409.17550 • Published Sep 26
Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis Paper • 2412.15322 • Published 6 days ago • 15
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network Paper • 2309.02836 • Published Sep 6, 2023
GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping Paper • 2405.17251 • Published May 27 • 2
SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer Paper • 2301.12811 • Published Jan 30, 2023
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation Paper • 2405.14598 • Published May 23 • 11
SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation Paper • 2405.18503 • Published May 28 • 9