AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models • Paper • 2509.12019 • Published Sep 15, 2025 • 2
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference • Paper • 2502.18137 • Published Feb 25, 2025 • 60