VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping Paper • 2412.11279 • Published 10 days ago • 12
MoVA: Adapting Mixture of Vision Experts to Multimodal Context Paper • 2404.13046 • Published Apr 19 • 1
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Paper • 2412.09618 • Published 13 days ago • 21
Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation Paper • 2412.06781 • Published 16 days ago • 18
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published 20 days ago • 104