Demo for Multimodal-SAE
Analyze video to describe actions and transcribe audio
Efficient 3D city generation in seconds!
Compare AI-generated videos by ability dimensions
High-resolution text-to-image generation
Interact with videos!
Generate detailed descriptions from images and videos
Generate 3D models from images