11

Lily

chenyingli

https://scholar.google.com/citations?user=iSgs5r0AAAAJ&hl=en&authuser=2

AI & ML interests

None yet

Organizations

None yet

chenyingli's activity

upvoted 2 papers 6 days ago

Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Paper • 2504.00906 • Published 7 days ago • 19

Z1: Efficient Test-time Scaling with Code

Paper • 2504.00810 • Published 7 days ago • 25

upvoted a paper 8 days ago

PHYSICS: Benchmarking Foundation Models on University-Level Physics Problem Solving

Paper • 2503.21821 • Published 13 days ago • 16

upvoted a paper 12 days ago

MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search

Paper • 2503.20757 • Published 13 days ago • 9

upvoted a paper 18 days ago

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published 19 days ago • 84

upvoted 3 papers 23 days ago

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published 26 days ago • 48

Charting and Navigating Hugging Face's Model Atlas

Paper • 2503.10633 • Published 26 days ago • 75

Transformers without Normalization

Paper • 2503.10622 • Published 26 days ago • 153

upvoted 2 papers about 1 month ago

IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval

Paper • 2503.04644 • Published Mar 6 • 20

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 105

upvoted a paper 3 months ago

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published Jan 21 • 86