Azusa Aisaka

Bujiazi

AI & ML interests

None yet

Organizations

None yet

Bujiazi's activity

commented on a paper 4 days ago

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

Paper • 2504.06232 • Published 13 days ago • 11

upvoted 2 papers 10 days ago

MM-IFEngine: Towards Multimodal Instruction Following

Paper • 2504.07957 • Published 11 days ago • 33

VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning

Paper • 2504.07956 • Published 11 days ago • 44

upvoted a paper 11 days ago

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

Paper • 2504.06232 • Published 13 days ago • 11

upvoted a paper about 1 month ago

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7 • 118

upvoted 2 papers about 2 months ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 76

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 73

upvoted 3 papers 2 months ago

upvoted a paper 3 months ago

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Paper • 2501.12368 • Published Jan 21 • 46

upvoted a paper 4 months ago

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 99

upvoted 3 papers 6 months ago

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Paper • 2410.17247 • Published Oct 22, 2024 • 48

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

Paper • 2410.16268 • Published Oct 21, 2024 • 69

BroadWay: Boost Your Text-to-Video Generation Model in a Training-free Way

Paper • 2410.06241 • Published Oct 8, 2024 • 10

liked a model 10 months ago

internlm/internlm-xcomposer2d5-7b

Visual Question Answering • Updated Jul 22, 2024 • 1.38k • 203

upvoted a paper 10 months ago

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3, 2024 • 96

liked a dataset 10 months ago

ShareGPT4Video/ShareGPT4Video

Viewer • Updated Mar 7 • 40.2k • 11.3k • 194

upvoted 2 papers 10 months ago

MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs

Paper • 2406.11833 • Published Jun 17, 2024 • 64

MotionClone: Training-Free Motion Cloning for Controllable Video Generation

Paper • 2406.05338 • Published Jun 8, 2024 • 42