Collections

Collections including paper arxiv:2412.07744

about 6 hours ago

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3 • 32
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17 • 25
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27 • 121
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17 • 21

about 4 hours ago

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Paper • 2401.09985 • Published Jan 18 • 15
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Paper • 2401.09962 • Published Jan 18 • 8
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution

Paper • 2401.10404 • Published Jan 18 • 10
ActAnywhere: Subject-Aware Video Background Generation

Paper • 2401.10822 • Published Jan 19 • 13

Video Generation Control-Style Transfer

StyleMaster: Stylize Your Video with Artistic Generation and Translation

Paper • 2412.07744 • Published 15 days ago • 19
Video Motion Transfer with Diffusion Transformers

Paper • 2412.07776 • Published 15 days ago • 17
ObjCtrl-2.5D: Training-free Object Control with Camera Poses

Paper • 2412.07721 • Published 15 days ago • 8
MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance

Paper • 2412.05355 • Published 19 days ago • 7

Gen AI Diffusion

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14 • 54
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7 • 70
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Paper • 2411.04709 • Published Nov 5 • 25
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9 • 41

XLabs-AI/flux-RealismLora

Text-to-Image • Updated Aug 22 • 262k • 912
StyleMaster: Stylize Your Video with Artistic Generation and Translation

Paper • 2412.07744 • Published 15 days ago • 19
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published 16 days ago • 45
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition

Paper • 2412.09501 • Published 14 days ago • 43

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22 • 89
StyleMaster: Stylize Your Video with Artistic Generation and Translation

Paper • 2412.07744 • Published 15 days ago • 19

Image-Gen StyleInject

Magic Insert: Style-Aware Drag-and-Drop

Paper • 2407.02489 • Published Jul 2 • 20
ZePo: Zero-Shot Portrait Stylization with Faster Sampling

Paper • 2408.05492 • Published Aug 10 • 7
CSGO: Content-Style Composition in Text-to-Image Generation

Paper • 2408.16766 • Published Aug 29 • 17
Style-Friendly SNR Sampler for Style-Driven Generation

Paper • 2411.14793 • Published Nov 22 • 36

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

Paper • 2311.17049 • Published Nov 28, 2023 • 1
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Paper • 2405.04434 • Published May 7 • 14
A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

Paper • 2303.17376 • Published Mar 30, 2023
Sigmoid Loss for Language Image Pre-Training

Paper • 2303.15343 • Published Mar 27, 2023 • 5
