Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation Paper • 2502.19414 • Published 15 days ago • 18
Has My System Prompt Been Used? Large Language Model Prompt Membership Inference Paper • 2502.09974 • Published 28 days ago • 10
Gemstones: A Model Suite for Multi-Faceted Scaling Laws Paper • 2502.06857 • Published Feb 7 • 25
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 124
Cut Your Losses in Large-Vocabulary Language Models Paper • 2411.09009 • Published Nov 13, 2024 • 47
CoTaEval Leaderboard 🚀 View and filter a leaderboard of language model evaluation results • Running • 2
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16, 2024 • 99
Goldfish Loss: Mitigating Memorization in LLMs Collection This collection contains artifacts from our paper titled: "Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs." • 9 items • Updated Oct 31, 2024 • 3