Instella ✨ Collection Announcing Instella, a series of 3-billion-parameter language models developed by AMD, trained from scratch on 128 Instinct MI300X GPUs. • 5 items • Updated about 13 hours ago • 3
Hallucination detection Collection Trained ModernBERT (base and large) for detecting hallucinations in LLM responses. The models are trained as token classifiers. • 4 items • Updated about 22 hours ago • 14
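A minimal sketch of how a token-classification hallucination detector like the ones in this collection could be queried with the transformers library. The checkpoint name and label convention below are assumptions for illustration, not details taken from the collection card.

```python
# Sketch: flagging hallucinated tokens in an LLM answer with a
# ModernBERT-based token classifier via transformers.
# The model ID and label mapping are hypothetical placeholders.
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

model_id = "your-org/modernbert-hallucination-detector"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

context = "The Eiffel Tower is 330 metres tall."
answer = "The Eiffel Tower is 500 metres tall."

# Encode context and answer as a pair so the classifier sees both.
inputs = tokenizer(context, answer, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# Assumed label convention: index 1 = "hallucinated" token.
pred = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
flagged = [t for t, p in zip(tokens, pred.tolist()) if p == 1]
print("Tokens flagged as hallucinated:", flagged)
```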
LettuceDetect: A Hallucination Detection Framework for RAG Applications Paper • 2502.17125 • Published 10 days ago • 7
Rank1: Test-Time Compute for Reranking in Information Retrieval Paper • 2502.18418 • Published 9 days ago • 24
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers Paper • 2502.18460 • Published 9 days ago • 1
rank1 Collection rank1 is the first test-time compute reasoning model for information retrieval • 15 items • Updated 7 days ago • 3
The Ultimate Collection of Code Classifiers Collection 🔥 15 classifiers, 124M parameters, one per programming language, for assessing the educational value of GitHub code • 15 items • Updated 14 days ago • 10
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 17 days ago • 93
Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub 23 days ago • 49
Article From Llasa to Llasagna 🍕: Finetuning LLaSA to generate Italian speech and other languages By Steveeeeeeen and 1 other • 23 days ago • 26
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 30 days ago • 198