SurveyX: Academic Survey Automation via Large Language Models • Paper • 2502.14776 • Published 8 days ago • 88
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity • Paper • 2502.13063 • Published 10 days ago • 62
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation • Paper • 2502.13145 • Published 10 days ago • 35
Phantom: Subject-consistent video generation via cross-modal alignment • Paper • 2502.11079 • Published 12 days ago • 51
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding • Paper • 2502.08946 • Published 16 days ago • 182
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • Paper • 2502.08910 • Published 16 days ago • 142
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance • Paper • 2502.08127 • Published 17 days ago • 49
TransMLA: Multi-head Latent Attention Is All You Need • Paper • 2502.07864 • Published 17 days ago • 45
Retrieval-augmented Large Language Models for Financial Time Series Forecasting • Paper • 2502.05878 • Published 19 days ago • 38
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction • Paper • 2502.07316 • Published 18 days ago • 45
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach • Paper • 2502.05171 • Published 21 days ago • 120
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling • Paper • 2502.06703 • Published 18 days ago • 140
On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices • Paper • 2502.04363 • Published 24 days ago • 11
Can LLMs Maintain Fundamental Abilities under KV Cache Compression? • Paper • 2502.01941 • Published 25 days ago • 14
The Differences Between Direct Alignment Algorithms are a Blur • Paper • 2502.01237 • Published 25 days ago • 111