Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines Paper • 2410.21220 • Published 1 day ago • 5
LongReward: Improving Long-context Large Language Models with AI Feedback Paper • 2410.21252 • Published 1 day ago • 13
GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation Paper • 2410.20474 • Published 2 days ago • 11