VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation • Paper • 2502.07531 • Published 4 days ago • 9
Goku: Flow Based Video Generative Foundation Models • Paper • 2502.04896 • Published 8 days ago • 78
Stable Flow: Vital Layers for Training-Free Image Editing • Paper • 2411.14430 • Published Nov 21, 2024 • 21
DynVFX: Augmenting Real Videos with Dynamic Content • Paper • 2502.03621 • Published 10 days ago • 27
Zero-Shot Voice Cloning • Collection • TTS models that support zero-shot voice cloning • 7 items • Updated Oct 26, 2024 • 9
steiner-preview • Collection • Reasoning models trained on synthetic data using reinforcement learning. • 3 items • Updated Oct 20, 2024 • 28
Moshi v0.1 Release • Collection • MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227
Emu3 • Collection • Emu3: Next-Token Prediction is All You Need • 7 items • Updated 2 days ago • 68
Audio Dialogues: Dialogues dataset for audio and music understanding • Paper • 2404.07616 • Published Apr 11, 2024 • 16