Zeyi Sun

Zery

https://github.com/SunzeY

AI & ML interests

CV

Organizations

None yet

Zery's activity

upvoted 2 papers 10 days ago

Kimi-VL Technical Report

Paper • 2504.07491 • Published 11 days ago • 114

MM-IFEngine: Towards Multimodal Instruction Following

Paper • 2504.07957 • Published 10 days ago • 33

upvoted a paper 11 days ago

GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography

Paper • 2504.07083 • Published 11 days ago • 22

updated a model 19 days ago

Zery/Qwen2-VL-7B_visual_rft_lisa_IoU_reward

Image-Text-to-Text • Updated 19 days ago • 788 • 4

updated a collection 19 days ago

RelightVid

3 items • Updated 19 days ago • 1

upvoted a collection 19 days ago

RelightVid

3 items • Updated 19 days ago • 1

liked a model 29 days ago

Zery/Qwen2-VL-7B_visual_rft_lisa_IoU_reward

Image-Text-to-Text • Updated 19 days ago • 788 • 4

published a model about 1 month ago

Zery/Qwen2-VL-7B_visual_rft_lisa_IoU_reward

Image-Text-to-Text • Updated 19 days ago • 788 • 4

upvoted a paper about 1 month ago

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7 • 118

upvoted a collection about 1 month ago

ViRFT Datasets

8 items • Updated Feb 24 • 8

liked a Space about 2 months ago

RelightVid

Generate relit videos from foreground and background inputs

authored a paper about 2 months ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 76

upvoted a paper about 2 months ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 76

commented on a paper about 2 months ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 76

upvoted 2 papers about 2 months ago

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting

Paper • 2501.16330 • Published Jan 27 • 2

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 73

authored a paper about 2 months ago

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting

Paper • 2501.16330 • Published Jan 27 • 2

liked a Space about 2 months ago

OmniParser V2

OmniParser, turn your LLM into GUI agent