Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 8 days ago • 96
Article Train 400x faster Static Embedding Models with Sentence Transformers 20 days ago • 132
Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 546
SmolVLM Collection State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated Dec 22, 2024 • 32
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 8 days ago • 311
MARS: Unleashing the Power of Variance Reduction for Training Large Models Paper • 2411.10438 • Published Nov 15, 2024 • 13
Multimodal Autoregressive Pre-training of Large Vision Encoders Paper • 2411.14402 • Published Nov 21, 2024 • 43
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated Dec 22, 2024 • 211
Granite 3.0 Language Models Collection A series of language models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 8 items • Updated Dec 18, 2024 • 95
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22, 2024 • 126
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Dec 13, 2024 • 144
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated Dec 13, 2024 • 50