Liger: Linearizing Large Language Models to Gated Recurrent Structures Paper • 2503.01496 • Published 6 days ago • 14
MoM: Linear Sequence Modeling with Mixture-of-Memories Paper • 2502.13685 • Published 18 days ago • 33