Xuenan Xu

wsntxxn

https://wsntxxn.github.io

AI & ML interests

Text to Speech Synthesis, Text to Music Synthesis, Singing Voice Synthesis

Recent Activity

new activity about 5 hours ago

wsntxxn/cnn8rnn-w2vmean-audiocaps-grounding: Training Repository & AudioCaps2.0

updated a Space 17 days ago

wsntxxn/MM-StoryAgent

new activity 17 days ago

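wsntxxn/MM-StoryAgent: Hello, I tested twice; each time the image appeared, then an image error was shown immediately, and after that the image could no longer be viewed or downloaded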

Organizations

None yet

Papers 10

arxiv:2407.14329

arxiv:2407.02869

arxiv:2407.02857

arxiv:2406.08052

Spaces 2

MM StoryAgent

Generate a storytelling video from a topic and scene

Efficient Audio Captioning

Models 7

wsntxxn/cnn8rnn-audioset-sed

Audio Classification • Updated Dec 30, 2024 • 365 • 3

wsntxxn/cnn14rnn-tempgru-audiocaps-captioning

Feature Extraction • Updated Dec 27, 2024 • 61 • 1

wsntxxn/effb2-trm-audiocaps-captioning

Feature Extraction • Updated Dec 20, 2024 • 59 • 1

wsntxxn/effb2-trm-clotho-captioning

Feature Extraction • Updated Dec 17, 2024 • 85 • 1

wsntxxn/cnn8rnn-w2vmean-audiocaps-grounding

Audio Classification • Updated Aug 19, 2024 • 212 • 2

wsntxxn/audiocaps-simple-tokenizer

Updated Jun 19, 2024

wsntxxn/clotho-simple-tokenizer

Updated Jun 19, 2024

Datasets

None public yet