Chain of Thought Empowers Transformers to Solve Inherently Serial Problems Paper • 2402.12875 • Published Feb 20 • 12
LLaMA-Omni: Seamless Speech Interaction with Large Language Models Paper • 2409.06666 • Published 25 days ago • 54
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data Paper • 2409.03810 • Published 30 days ago • 30
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated 17 days ago • 472
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B Paper • 2406.07394 • Published Jun 11 • 21
view article Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare Apr 19 • 102
Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On Paper • 2407.08348 • Published Jul 11 • 50
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3 • 43
Load 4bit models 4x faster Collection Native bitsandbytes 4bit pre quantized models • 25 items • Updated 2 days ago • 46
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 64
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 60
👑 Monarch Collection Family of 7B models that combine excellent reasoning and conversational abilities. • 7 items • Updated Aug 16 • 11
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding Paper • 2401.04398 • Published Jan 9 • 20
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment Paper • 2308.05374 • Published Aug 10, 2023 • 27
Flamingo: a Visual Language Model for Few-Shot Learning Paper • 2204.14198 • Published Apr 29, 2022 • 14
To Adapt or Not to Adapt? Real-Time Adaptation for Semantic Segmentation Paper • 2307.15063 • Published Jul 27, 2023 • 17