HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration • Paper • 2504.03536 • Published 18 days ago • 12
Scaling Analysis of Interleaved Speech-Text Language Models • Paper • 2504.02398 • Published 19 days ago • 27
SkyReels-A2: Compose Anything in Video Diffusion Transformers • Paper • 2504.02436 • Published 19 days ago • 35
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models • Paper • 2504.04718 • Published 15 days ago • 39
SmolVLM: Redefining small and efficient multimodal models • Paper • 2504.05299 • Published 15 days ago • 167
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models • Paper • 2503.24235 • Published 22 days ago • 53
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 • Paper • 2503.24376 • Published 22 days ago • 38
Unicorn: Text-Only Data Synthesis for Vision Language Model Training • Paper • 2503.22655 • Published 25 days ago • 38
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance • Paper • 2504.01724 • Published 20 days ago • 64
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization • Paper • 2504.00999 • Published 21 days ago • 83
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback • Paper • 2503.22230 • Published 25 days ago • 43
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction • Paper • 2504.01014 • Published 21 days ago • 63
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes • Paper • 2503.23461 • Published 23 days ago • 94