DREAM: Dense Retrieval Embeddings via Autoregressive Modeling • Paper • 2606.24667 • Published 8 days ago • 6
ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing? • Paper • 2606.19531 • Published 14 days ago • 22
Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation • Paper • 2606.17030 • Published 16 days ago • 46
UniDDT: Unifying Multimodal Understanding and Generation with Decoupled Diffusion Transformer • Paper • 2606.16255 • Published 16 days ago • 14