ibm-granite/granite-embedding-278m-multilingual Sentence Similarity • Updated 17 days ago • 19.8k • 22
SmolVLM 256M & 500M Collection • Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 11 days ago • 64
HelpSteer2: Open-source dataset for training top-performing reward models Paper • 2406.08673 • Published Jun 12, 2024 • 18
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset Paper • 2205.12522 • Published May 25, 2022 • 2
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation Paper • 2402.03216 • Published Feb 5, 2024 • 5
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published Jan 2 • 48
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published Jan 1 • 99
Large Concept Models: Language Modeling in a Sentence Representation Space Paper • 2412.08821 • Published Dec 11, 2024 • 13