ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models Paper • 2406.16635 • Published Jun 24, 2024
TokenButler Collection TokenButler -- Predict token importance for all heads across the transformer in the first layer itself. Enable fine-grained token sparsity! • 6 items • Updated 1 day ago • 2