Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Paper • 2501.13629 • Published 1 day ago • 32
Benchmarking Chinese Knowledge Rectification in Large Language Models Paper • 2409.05806 • Published Sep 9, 2024 • 14
OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs Paper • 2409.05152 • Published Sep 8, 2024 • 31
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory Paper • 2406.12375 • Published Jun 18, 2024 • 1
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2, 2024 • 104