COSMOS: Predictable and Cost-Effective Adaptation of LLMs Paper • 2505.01449 • Published Apr 30, 2025 • 4
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models Paper • 2403.07384 • Published Mar 12, 2024 • 3
Less is More: Improving LLM Alignment via Preference Data Selection Paper • 2502.14560 • Published Feb 20, 2025 • 1
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning Paper • 2312.15685 • Published Dec 25, 2023 • 17
GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation Paper • 2603.26661 • Published 8 days ago • 18
Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published 3 days ago • 24
A Neuroscience-Inspired Dual-Process Model of Compositional Generalization Paper • 2507.18868 • Published Jul 25, 2025 • 1
LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers Paper • 2310.15164 • Published Oct 23, 2023 • 4
JEPA-Reasoner: Decoupling Latent Reasoning from Token Generation Paper • 2512.19171 • Published Dec 22, 2025 • 1
Disentangling Reasoning Capabilities from Language Models with Compositional Reasoning Transformers Paper • 2210.11265 • Published Oct 20, 2022 • 1
The Dual-Stream Transformer: Channelized Architecture for Interpretable Language Modeling Paper • 2603.07461 • Published 28 days ago • 1
Interpretable-by-Design Transformers via Architectural Stream Independence Paper • 2603.07482 • Published 28 days ago • 1
Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization Paper • 2506.13331 • Published Jun 16, 2025 • 1
When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners Paper • 2505.15257 • Published May 21, 2025 • 1
Are formal and functional linguistic mechanisms dissociated in language models? Paper • 2503.11302 • Published Mar 14, 2025 • 1
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models Paper • 2601.07372 • Published Jan 12, 2026 • 47