What matters for Representation Alignment: Global Information or Spatial Structure? • Paper • 2512.10794 • Published 16 days ago • 8
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management • Paper • 2512.12967 • Published 12 days ago • 100
Towards Scalable Pre-training of Visual Tokenizers for Generation • Paper • 2512.13687 • Published 11 days ago • 95
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length • Paper • 2512.04677 • Published 23 days ago • 168
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer • Paper • 2511.22699 • Published 29 days ago • 213
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models • Paper • 2512.02014 • Published 25 days ago • 68
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation • Paper • 2511.09611 • Published Nov 12 • 68