HAODONG DUAN

KennyUTC

https://kennymckormick.github.io

AI & ML interests

Video Understanding; Multi-Modal Learning

KennyUTC's activity

authored a paper 1 day ago

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Paper • 2504.02826 • Published 1 day ago • 55

upvoted a paper 1 day ago

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Paper • 2504.02826 • Published 1 day ago • 55

commented on a paper 1 day ago

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing

Paper • 2504.02826 • Published 1 day ago • 55

authored a paper 9 days ago

LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning?

Paper • 2503.19990 • Published 11 days ago • 31

upvoted a paper 9 days ago

LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning?

Paper • 2503.19990 • Published 11 days ago • 31

commented on a paper 9 days ago

LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning?

Paper • 2503.19990 • Published 11 days ago • 31

liked 3 datasets 10 days ago

authored a paper 17 days ago

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

Paper • 2503.14478 • Published 18 days ago • 42

upvoted a paper 17 days ago

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

Paper • 2503.14478 • Published 18 days ago • 42

upvoted a paper 22 days ago

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published 23 days ago • 32

authored a paper 22 days ago

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published 23 days ago • 32

upvoted a paper 26 days ago

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published 29 days ago • 112

authored a paper about 1 month ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 74

upvoted a paper about 1 month ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 74

authored a paper about 1 month ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 71

upvoted a paper about 1 month ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 71

updated a Space about 1 month ago

Open LMM Reasoning Leaderboard

🥇

A Leaderboard that demonstrates LMM reasoning capabilities

upvoted a paper about 1 month ago

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Paper • 2502.13128 • Published Feb 18 • 40