Hierarchically Gated Recurrent Neural Network for Sequence Modeling Paper • 2311.04823 • Published Nov 8, 2023 • 2
Accelerating Toeplitz Neural Network with Constant-time Inference Complexity Paper • 2311.08756 • Published Nov 15, 2023 • 1
CO2: Efficient Distributed Training with Full Communication-Computation Overlap Paper • 2401.16265 • Published Jan 29, 2024 • 1
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention Paper • 2405.17381 • Published May 27, 2024
Scaling Image Tokenizers with Grouped Spherical Quantization Paper • 2412.02632 • Published Dec 2024 • 10