AQLM+PV (Collection) • Official AQLM quantizations for "PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression" (https://arxiv.org/abs/2405.14852) • 26 items • Updated Feb 28
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs • Paper 2502.14837 • Published Feb 20
Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication • Paper 2402.18439 • Published Feb 28, 2024
OneLLM: One Framework to Align All Modalities with Language • Paper 2312.03700 • Published Dec 6, 2023
Optimum-NVIDIA: Unlock blazingly fast LLM inference in just 1 line of code • Article • Dec 5, 2023
NVLM 1.0 (Collection) • A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language and text-only tasks • 2 items • Updated about 20 hours ago