R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization Paper • 2503.10615 • Published 6 days ago • 16
Neural Gaffer: Relighting Any Object via Diffusion Paper • 2406.07520 • Published Jun 11, 2024 • 6
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation Paper • 2412.14015 • Published Dec 18, 2024 • 12
StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models Paper • 2412.13188 • Published Dec 17, 2024
stabilityai/stable-video-diffusion-img2vid-xt Image-to-Video • Updated Jul 10, 2024 • 664k • 2.96k