Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Paper • 2412.04432 • Published 20 days ago • 14
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Paper • 2412.04432 • Published 20 days ago • 14
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Paper • 2412.04432 • Published 20 days ago • 14 • 2
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation Paper • 2412.04445 • Published 20 days ago • 21
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation Paper • 2412.04445 • Published 20 days ago • 21
SEED-Story: Multimodal Long Story Generation with Large Language Model Paper • 2407.08683 • Published Jul 11 • 22