SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 139
On Domain-Specific Post-Training for Multimodal Large Language Models Paper • 2411.19930 • Published Nov 29, 2024 • 28
SmolVLM 256M & 500M Collection Collection for models & demos for even smoller SmolVLM release • 12 items • Updated Feb 20 • 72
Multilingual LLM Evaluation Collection Multilingual Evaluation Benchmarks • 8 items • Updated Mar 3 • 25
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 226
LLMs for Extremely Low-Resource Finno-Ugric Languages Paper • 2410.18902 • Published Oct 24, 2024 • 3
MaLA-LM Collection MaLA-LM: Massive Language Adaptation of Large Language Models • 7 items • Updated Oct 7, 2024 • 1
4M Models Collection Multimodal models from https://4m.epfl.ch/ • 17 items • Updated 29 days ago • 31
AIMv2 Collection A collection of AIMv2 vision encoders that support a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 74
LLM2CLIP Collection LLM2CLIP makes SOTA pretrained CLIP models even more SOTA. • 11 items • Updated 24 days ago • 59
GLiClass Collection Generalist and Lightweight Models for Zero-shot Text Classification • 13 items • Updated Sep 17, 2024 • 14