Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published 6 days ago • 65
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published 17 days ago • 44
Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization Paper • 2502.04295 • Published Feb 6 • 13
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published Jan 13 • 92
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 260
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers Paper • 2408.06195 • Published Aug 12, 2024 • 70
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22, 2024 • 65
SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference Paper • 2303.08308 • Published Mar 15, 2023 • 1
ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices Paper • 2303.09730 • Published Mar 17, 2023 • 1
Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models Paper • 2310.05015 • Published Oct 8, 2023 • 1
Language Models as Black-Box Optimizers for Vision-Language Models Paper • 2309.05950 • Published Sep 12, 2023 • 4
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 256
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 717
Open LLM Leaderboard 🏆 Track, rank and evaluate open LLMs and chatbots • 12.7k
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21, 2024 • 115