Minitron (Collection): A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 2 days ago
Better & Faster Large Language Models via Multi-token Prediction (Paper) • arXiv:2404.19737 • Published Apr 30, 2024