One Graph Model for Cross-domain Dynamic Link Prediction Paper • 2402.02168 • Published Feb 3, 2024 • 1
PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding Paper • 2501.16411 • Published 8 days ago • 17
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published 28 days ago • 23
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published 28 days ago • 42
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation Paper • 2412.07589 • Published Dec 10, 2024 • 45
Research Paper Collection: Research Papers from Researcher/Member of MeissonFlow. • 1 item • Updated Dec 11, 2024
HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing Paper • 2412.04280 • Published Dec 5, 2024 • 13
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea Paper • 2411.15738 • Published Nov 24, 2024 • 1
RelationBooth: Towards Relation-Aware Customized Object Generation Paper • 2410.23280 • Published Oct 30, 2024
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models Paper • 2410.13370 • Published Oct 17, 2024 • 36
LaT: Latent Translation with Cycle-Consistency for Video-Text Retrieval Paper • 2207.04858 • Published Jul 11, 2022
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis Paper • 2410.08261 • Published Oct 10, 2024 • 50
Generalizable Entity Grounding via Assistance of Large Language Model Paper • 2402.02555 • Published Feb 4, 2024