T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining Paper • 2404.17806 • Published Apr 27, 2024
AudioTime: A Temporally-aligned Audio-text Benchmark Dataset Paper • 2407.02857 • Published Jul 3, 2024
Enhance Temporal Relations in Audio Captioning with Sound Event Detection Paper • 2306.01533 • Published Jun 2, 2023
Efficient Audio Captioning with Encoder-Level Knowledge Distillation Paper • 2407.14329 • Published Jul 19, 2024