Rui Zhao

ruizhaocv

https://ruizhaocv.github.io/

AI & ML interests

Multimodal and GenAI

ruizhaocv's activity

upvoted a paper 10 days ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published 11 days ago • 70

upvoted a paper 15 days ago

XAttention: Block Sparse Attention with Antidiagonal Scoring

Paper • 2503.16428 • Published 15 days ago • 12

liked a Space 15 days ago

tencent/Hunyuan3D-2mini-Turbo

Fast Images-to-3D Generation within 1 Second

upvoted 2 papers 15 days ago

MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance

Paper • 2503.16421 • Published 15 days ago • 9

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

Paper • 2503.16365 • Published 16 days ago • 35

upvoted a paper 17 days ago

Impossible Videos

Paper • 2503.14378 • Published 18 days ago • 57

upvoted 2 papers 19 days ago

FlowTok: Flowing Seamlessly Across Text and Image Tokens

Paper • 2503.10772 • Published 22 days ago • 18

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published 21 days ago • 128

upvoted a paper 22 days ago

CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance

Paper • 2503.10391 • Published 23 days ago • 10

upvoted 2 papers 23 days ago

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published 24 days ago • 43

Tuning-Free Multi-Event Long Video Generation via Synchronized Coupled Sampling

Paper • 2503.08605 • Published 25 days ago • 26

liked a model 25 days ago

tencent/HunyuanVideo-I2V

Image-to-Video • Updated 23 days ago • 4.55k • 286

upvoted a paper 25 days ago

Automated Movie Generation via Multi-Agent CoT Planning

Paper • 2503.07314 • Published 26 days ago • 42

upvoted a paper 30 days ago

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

Paper • 2503.00865 • Published Mar 2 • 61

authored a paper 30 days ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published about 1 month ago • 16

upvoted a paper about 1 month ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published about 1 month ago • 16

commented on a paper about 1 month ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published about 1 month ago • 16

upvoted 2 papers about 1 month ago

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Paper • 2503.01774 • Published Mar 3 • 41

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

Paper • 2502.14397 • Published Feb 20 • 40

liked a model about 1 month ago

Comfy-Org/HunyuanVideo_repackaged

Updated 27 days ago • 168