Wenyue Hua

wenyueH

https://wenyueh.github.io/

AI & ML interests

LLM-based agent, LLM reasoning

Organizations

None yet

wenyueH's activity

updated a dataset about 1 month ago

wenyueH/InductionBench

Updated about 1 month ago • 146 • 1

liked a dataset about 1 month ago

wenyueH/InductionBench

Updated about 1 month ago • 146 • 1

published a dataset about 1 month ago

wenyueH/InductionBench

Updated about 1 month ago • 146 • 1

upvoted a paper about 1 month ago

InductionBench: LLMs Fail in the Simplest Complexity Class

Paper • 2502.15823 • Published Feb 20 • 7

commented on a paper about 1 month ago

InductionBench: LLMs Fail in the Simplest Complexity Class

Paper • 2502.15823 • Published Feb 20 • 7

authored a paper 4 months ago

RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios

Paper • 2412.08972 • Published Dec 12, 2024 • 10

upvoted a paper 4 months ago

RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios

Paper • 2412.08972 • Published Dec 12, 2024 • 10

commented on a paper 4 months ago

RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios

Paper • 2412.08972 • Published Dec 12, 2024 • 10

upvoted a paper 4 months ago

Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions

Paper • 2412.08737 • Published Dec 11, 2024 • 53

upvoted a paper 5 months ago

Game-theoretic LLM: Agent Workflow for Negotiation Games

Paper • 2411.05990 • Published Nov 8, 2024 • 8

commented on a paper 5 months ago

Game-theoretic LLM: Agent Workflow for Negotiation Games

Paper • 2411.05990 • Published Nov 8, 2024 • 8

published an article about 1 year ago

NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates

Article • with 3 others • Published Feb 2, 2024 • 4

upvoted a paper about 1 year ago

How to Index Item IDs for Recommendation Foundation Models

Paper • 2305.06569 • Published May 11, 2023 • 1

liked a Space about 1 year ago

NPHardEval Leaderboard

🥇

Explore and compare LLM models through a leaderboard

authored 2 papers about 1 year ago

OpenAGI: When LLM Meets Domain Experts

Paper • 2304.04370 • Published Apr 10, 2023 • 1

EntQA: Entity Linking as Question Answering

Paper • 2110.02369 • Published Oct 5, 2021

upvoted a paper about 1 year ago

The Impact of Reasoning Step Length on Large Language Models

Paper • 2401.04925 • Published Jan 10, 2024 • 18

authored a paper about 1 year ago

The Impact of Reasoning Step Length on Large Language Models

Paper • 2401.04925 • Published Jan 10, 2024 • 18

authored a paper over 1 year ago

How to Index Item IDs for Recommendation Foundation Models

Paper • 2305.06569 • Published May 11, 2023 • 1

liked a model almost 2 years ago

Angainor/alpaca-lora-13b

Updated Apr 4, 2023 • 11