Collections
Discover the best community collections!
Collections including paper arxiv:2501.14677
-
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
Paper • 2410.10306 • Published • 54 -
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
Paper • 2411.05003 • Published • 70 -
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
Paper • 2411.04709 • Published • 25 -
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Paper • 2410.07171 • Published • 42
-
Depth Anything V2
Paper • 2406.09414 • Published • 97 -
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Paper • 2406.09415 • Published • 51 -
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
Paper • 2406.04338 • Published • 35 -
SAM 2: Segment Anything in Images and Videos
Paper • 2408.00714 • Published • 113
-
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Paper • 2312.04474 • Published • 31 -
Training Chain-of-Thought via Latent-Variable Inference
Paper • 2312.02179 • Published • 9 -
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
Paper • 2312.01552 • Published • 31 -
AppAgent: Multimodal Agents as Smartphone Users
Paper • 2312.13771 • Published • 53