Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • 11 days ago • 23
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 1 day ago • 162
Qwen2.5 Collection The Qwen 2.5 models are a series of AI models trained on 18 trillion tokens, supporting 29 languages and offering advanced features such as instructio • 33 items • Updated Oct 12, 2024 • 7
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 13 days ago • 330
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15, 2024 • 73
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • 17 days ago • 61
InternVL2.5-MPO Collection Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 11 days ago • 26
Article Yay! Organizations can now publish blog Articles By huggingface and 3 others • 20 days ago • 32
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference 25 days ago • 64
Article MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era By MiniMax-AI • 25 days ago • 40