T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining Paper • 2404.17806 • Published Apr 27, 2024
AudioTime: A Temporally-aligned Audio-text Benchmark Dataset Paper • 2407.02857 • Published Jul 3, 2024
Zero-Shot Audio Captioning Using Soft and Hard Prompts Paper • 2406.06295 • Published Jun 10, 2024
Enhance Temporal Relations in Audio Captioning with Sound Event Detection Paper • 2306.01533 • Published Jun 2, 2023
wsntxxn/cnn14rnn-tempgru-audiocaps-captioning Feature Extraction • Updated 16 days ago • 169 • 1
wsntxxn/cnn8rnn-w2vmean-audiocaps-grounding Audio Classification • Updated Aug 19, 2024 • 111 • 2
Efficient Audio Captioning with Encoder-Level Knowledge Distillation Paper • 2407.14329 • Published Jul 19, 2024 • 5
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation Paper • 2407.02869 • Published Jul 3, 2024 • 18 • 5
A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds Paper • 2403.04594 • Published Mar 7, 2024