Yang

jacklanda

AI & ML interests

Reasoning, Mech Interp, Semantics

Recent Activity

authored a paper about 15 hours ago

Xetrieval: Mechanistically Explaining Dense Retrieval

Paper • 2605.29507 • Published 5 days ago • 17

updated a collection 3 days ago

Semantics

My Research work on (Lexical) Semantics. • 5 items • Updated 3 days ago

upvoted a paper 4 days ago

Xetrieval: Mechanistically Explaining Dense Retrieval

Paper • 2605.29507 • Published 5 days ago • 17

liked a dataset 8 days ago

jacklanda/SemanticQA

Updated Apr 24 • 145 • 1

commented a paper 14 days ago

Revisiting a Pain in the Neck: A Semantic Reasoning Benchmark for Language Models

Paper • 2604.16593 • Published Apr 17 • 6

New activity in RuleReasoner/RuleCollection-32K 25 days ago

Update README

#6 opened 25 days ago by

Update README.md

#5 opened 25 days ago by

liked a Space 26 days ago

Croissant Checker - Dev

Validate Croissant dataset files for NeurIPS submissions

updated a dataset about 1 month ago

jacklanda/SemanticQA

Updated Apr 24 • 145 • 1

published a dataset about 1 month ago

jacklanda/SemanticQA

Updated Apr 24 • 145 • 1

authored a paper about 1 month ago

Revisiting a Pain in the Neck: A Semantic Reasoning Benchmark for Language Models

Paper • 2604.16593 • Published Apr 17 • 6

updated 2 collections about 1 month ago

Semantics

My Research work on (Lexical) Semantics. • 5 items • Updated 3 days ago

Evaluations

Evals for Language Agents • 4 items • Updated Apr 21

upvoted a paper about 1 month ago

Revisiting a Pain in the Neck: A Semantic Reasoning Benchmark for Language Models

Paper • 2604.16593 • Published Apr 17 • 6

submitted a paper to Daily Papers about 1 month ago

Revisiting a Pain in the Neck: A Semantic Reasoning Benchmark for Language Models

Paper • 2604.16593 • Published Apr 17 • 6

updated a collection 2 months ago

Evaluations

Evals for Language Agents • 4 items • Updated Apr 21

updated a dataset 3 months ago

humanlaya-data-lab/OneMillion-Bench

Viewer • Updated Mar 11 • 400 • 160 • 11

commented a paper 3 months ago

$OneMillion-Bench: How Far are Language Agents from Human Experts?

Paper • 2603.07980 • Published Mar 9 • 27