NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates Feb 2, 2024 • 3
RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios Paper • 2412.08972 • Published Dec 12, 2024 • 9
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published Dec 11, 2024 • 52
Game-theoretic LLM: Agent Workflow for Negotiation Games Paper • 2411.05990 • Published Nov 8, 2024 • 7
How to Index Item IDs for Recommendation Foundation Models Paper • 2305.06569 • Published May 11, 2023 • 1
The Impact of Reasoning Step Length on Large Language Models Paper • 2401.04925 • Published Jan 10, 2024 • 16