Negative Token Merging: Image-based Adversarial Feature Guidance Paper • 2412.01339 • Published 22 days ago • 21
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published 20 days ago • 118
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters Paper • 2412.00174 • Published 25 days ago • 22
Open-Sora Plan: Open-Source Large Video Generation Model Paper • 2412.00131 • Published 26 days ago • 32
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Paper • 2411.19146 • Published 26 days ago • 13
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model Paper • 2411.19108 • Published 26 days ago • 17
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models Paper • 2411.18613 • Published 27 days ago • 50
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published 28 days ago • 76
AnimateAnything: Consistent and Controllable Animation for Video Generation Paper • 2411.10836 • Published Nov 16 • 23
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models Paper • 2411.07126 • Published Nov 11 • 28
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality Paper • 2410.19355 • Published Oct 25 • 23
VidPanos: Generative Panoramic Videos from Casual Panning Videos Paper • 2410.13832 • Published Oct 17 • 12
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated Oct 15 • 148
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations Paper • 2410.10792 • Published Oct 14 • 29
Animate-X: Universal Character Image Animation with Enhanced Motion Representation Paper • 2410.10306 • Published Oct 14 • 54
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation Paper • 2410.07171 • Published Oct 9 • 41
TextToon: Real-Time Text Toonify Head Avatar from Single Video Paper • 2410.07160 • Published Sep 23 • 8
TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation Paper • 2410.05591 • Published Oct 8 • 13