Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published 24 days ago • 121
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Paper • 2502.02492 • Published 27 days ago • 58
Unifying Specialized Visual Encoders for Video Language Models Paper • 2501.01426 • Published Jan 2 • 21
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems Paper • 2407.01370 • Published Jul 1, 2024 • 86
Salesforce/xgen-mm-phi3-mini-instruct-r-v1 Image-Text-to-Text • Updated 28 days ago • 1.19k • 184
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild Paper • 2305.11147 • Published May 18, 2023 • 3