Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Paper • 2502.11089 • Published • 78
We have also had good success applying TTS more broadly to a diverse set of tasks in optillm - https://github.com/codelion/optillm
Great survey. If you want to play around with many of these techniques, you can do so in our open-source optimizing inference proxy, optillm - https://github.com/codelion/optillm