Multilingual LLM Evaluation Collection Multilingual Evaluation Benchmarks • 8 items • Updated 13 days ago • 25
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam Paper • 2502.17055 • Published 20 days ago • 16
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 19 days ago • 69
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference Paper • 2502.18137 • Published 19 days ago • 53
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published 28 days ago • 143
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models Paper • 2502.09604 • Published about 1 month ago • 33
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published Feb 13 • 143
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 204
Transformers Can Navigate Mazes With Multi-Step Prediction Paper • 2412.05117 • Published Dec 6, 2024 • 5
view article Article Releasing the largest multilingual open pretraining dataset By Pclanglais and 2 others • Nov 13, 2024 • 100
view article Article SauerkrautLM's Multi-Phase Spectrum Training: A Technical Deep Dive By DavidGF • Nov 9, 2024 • 9
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization Paper • 2411.02355 • Published Nov 4, 2024 • 49