SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space Paper • 2511.20102 • Published about 1 month ago • 26
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation Paper • 2502.21074 • Published Feb 28 • 4