Running 1.59k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
mistralai/Mistral-Small-24B-Instruct-2501 Text Generation • Updated 24 days ago • 755k • 819
R3GAN Collection R3GAN: A Modern Baseline GAN https://github.com/brownvc/R3GAN/ https://arxiv.org/abs/2501.05441 • 7 items • Updated Jan 10 • 10
nomic-ai/modernbert-embed-base-unsupervised Sentence Similarity • Updated Dec 30, 2024 • 362 • 10
Scaling Test-Time Compute with Open Models Collection Models and datasets used in our blog post: https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute • 10 items • Updated Jan 6 • 23
Long Context RAG Performance of Large Language Models Paper • 2411.03538 • Published Nov 5, 2024 • 1