DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 191
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published 6 days ago • 46