Collections

Collections including paper arxiv:2501.14677

Video collection

about 6 hours ago

MatAnyone: Stable Video Matting with Consistent Memory Propagation

Paper • 2501.14677 • Published 11 days ago • 18

about 2 hours ago

MatAnyone: Stable Video Matting with Consistent Memory Propagation

Paper • 2501.14677 • Published 11 days ago • 18

about 4 hours ago

Running on T4

2.31k

XTTS

🐸
Running on Zero

1.06k

FLUX.1 RealismLora

🎀

FLUX.1 RealismLora
Running on Zero

225

Kokoro TTS Zero

🎴

✨[With v1.0.0] Accelerated TTS on Kokoro-82M
Running on L40S

433

SORA 3D

🏢

Create top-quality 3D(.GLB) models from text or images

Gen AI Diffusion

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14, 2024 • 54
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 70
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Paper • 2411.04709 • Published Nov 5, 2024 • 25
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9, 2024 • 42

about 19 hours ago

Depth Anything V2

Paper • 2406.09414 • Published Jun 13, 2024 • 97
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Paper • 2406.09415 • Published Jun 13, 2024 • 51
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion

Paper • 2406.04338 • Published Jun 6, 2024 • 35
SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 113

Interesting Papers

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Paper • 2312.04474 • Published Dec 7, 2023 • 31
Training Chain-of-Thought via Latent-Variable Inference

Paper • 2312.02179 • Published Nov 28, 2023 • 9
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning

Paper • 2312.01552 • Published Dec 4, 2023 • 31
AppAgent: Multimodal Agents as Smartphone Users

Paper • 2312.13771 • Published Dec 21, 2023 • 53
