SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published Mar 10, 2025 • 66
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 193
ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning Paper • 2410.17779 • Published Oct 23, 2024 • 9
Value Residual Learning For Alleviating Attention Concentration In Transformers Paper • 2410.17897 • Published Oct 23, 2024 • 9
The Nature of Mathematical Modeling and Probabilistic Optimization Engineering in Generative AI Paper • 2410.18441 • Published Oct 24, 2024 • 7
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models Paper • 2410.18252 • Published Oct 23, 2024 • 7
Should We Really Edit Language Models? On the Evaluation of Edited Language Models Paper • 2410.18785 • Published Oct 24, 2024 • 7
ZIP-FIT: Embedding-Free Data Selection via Compression-Based Alignment Paper • 2410.18194 • Published Oct 23, 2024 • 6
Data Scaling Laws in Imitation Learning for Robotic Manipulation Paper • 2410.18647 • Published Oct 24, 2024 • 6
Pantograph: A Machine-to-Machine Interaction Interface for Advanced Theorem Proving, High Level Reasoning, and Data Extraction in Lean 4 Paper • 2410.16429 • Published Oct 21, 2024 • 5
Multi-Draft Speculative Sampling: Canonical Architectures and Theoretical Limits Paper • 2410.18234 • Published Oct 23, 2024 • 5
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance Paper • 2410.13816 • Published Oct 17, 2024 • 2
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding Paper • 2410.13924 • Published Oct 17, 2024 • 7
TP-Eval: Tap Multimodal LLMs' Potential in Evaluation by Customizing Prompts Paper • 2410.18071 • Published Oct 23, 2024 • 7
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias Paper • 2410.17242 • Published Oct 22, 2024 • 5