Zhisheng Zheng

zhisheng01

https://zhishengzheng.com/

zhisheng147

AI & ML interests

LLM, Speech and Audio Processing

zhisheng01's activity

liked a model 25 days ago

Qwen/Qwen2.5-Omni-7B

Any-to-Any • Updated 6 days ago • 167k • 1.45k

liked a dataset about 1 month ago

laion/laions_got_talent

Viewer • Updated Jan 5 • 461k • 1.23k • 25

upvoted 2 papers about 1 month ago

Charting and Navigating Hugging Face's Model Atlas

Paper • 2503.10633 • Published Mar 13 • 76

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Paper • 2503.04724 • Published Mar 6 • 69

upvoted 2 papers about 2 months ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 73

Slamming: Training a Speech Language Model on One GPU in a Day

Paper • 2502.15814 • Published Feb 19 • 69

upvoted a paper 2 months ago

Soundwave: Less is More for Speech-Text Alignment in LLMs

Paper • 2502.12900 • Published Feb 18 • 84

liked a dataset 2 months ago

baijs/AudioSetCaps

Preview • Updated Nov 27, 2024 • 123 • 20

liked 2 models 2 months ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Text Generation • Updated Feb 24 • 1.7M • 1.17k

deepseek-ai/DeepSeek-R1

Text Generation • Updated 25 days ago • 1.73M • 12k

upvoted 2 papers 2 months ago

AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting

Paper • 2502.05176 • Published Feb 7 • 36

Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis

Paper • 2502.04128 • Published Feb 6 • 25

liked a dataset 2 months ago

CAiRE/ASCEND

Viewer • Updated Jul 16, 2024 • 12.3k • 369 • 34

upvoted a paper 3 months ago

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Paper • 2501.06282 • Published Jan 10 • 51

liked a model 4 months ago

deepseek-ai/DeepSeek-V3

Text Generation • Updated 25 days ago • 749k • 3.81k

liked 2 models 5 months ago

nyrahealth/CrisperWhisper

Automatic Speech Recognition • Updated Dec 19, 2024 • 10.2k • 269

kyutai/mimi

Feature Extraction • Updated Sep 18, 2024 • 225k • 192

liked a dataset 6 months ago

walkerhyf/NCSSD

Updated Nov 12, 2024 • 79 • 20

upvoted a paper 6 months ago

Movie Gen: A Cast of Media Foundation Models

Paper • 2410.13720 • Published Oct 17, 2024 • 98