VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation
Paper • 2502.07531 • Published • 8
Emotion Aware TTS System
Stable audio open model from Synthio paper.
PHOTOREALISTIC HUMAN RECONSTRUCTION w/ CROSS-SCALE DIFF
Audio Conditioned LipSync with Latent Diffusion Models
Add a logo to anything
Next-generation reasoning model that runs locally in-browser
Transform video frames using text instructions