Unlocking Continual Learning Abilities in Language Models Paper • 2406.17245 • Published Jun 25, 2024 • 31
Layerwise Recurrent Router for Mixture-of-Experts Paper • 2408.06793 • Published Aug 13, 2024 • 33