Zesen Cheng

ClownRat

AI & ML interests

Multi-modal foundation models; segmentation, detection, and tracking

Recent Activity

upvoted a paper 14 days ago

MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation

authored a paper 14 days ago

MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation

upvoted a paper 19 days ago

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Organizations

Collections 1

Papers 15

arxiv:2503.14428

arxiv:2502.13923

arxiv:2501.13106

arxiv:2501.00599

Models 5

ClownRat/VideoLLaMA2.1-7B-16F

Text Generation • Updated Jan 6 • 14

ClownRat/resnet-50-torchvision

Updated Dec 25, 2024 • 5

ClownRat/mask2former-resnet-50-coco-instance

Updated Dec 25, 2024 • 57

ClownRat/resnet-101-torchvision

Updated Dec 23, 2024 • 7

ClownRat/mask2former-resnet-101-coco-instance

Updated Dec 17, 2024 • 18

Datasets 2

ClownRat/YoutubeVIS-2019

Updated Jan 26 • 27

ClownRat/COCO2017-Instance

Viewer • Updated Dec 11, 2024 • 123k • 51 • 1