Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design • Paper • arXiv:2410.19123 • Published Oct 24
Compact Language Models via Pruning and Knowledge Distillation • Paper • arXiv:2407.14679 • Published Jul 19