Lily

chenyingli

https://scholar.google.com/citations?user=iSgs5r0AAAAJ&hl=en&authuser=2

AI & ML interests

None yet

Recent Activity

upvoted a paper 21 days ago

Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

upvoted a paper 21 days ago

Z1: Efficient Test-time Scaling with Code

upvoted a paper 23 days ago

PHYSICS: Benchmarking Foundation Models on University-Level Physics Problem Solving

Organizations

None yet

chenyingli's activity

upvoted 2 papers 21 days ago

Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Paper • 2504.00906 • Published 21 days ago • 21

Z1: Efficient Test-time Scaling with Code

Paper • 2504.00810 • Published 21 days ago • 26

upvoted a paper 23 days ago

PHYSICS: Benchmarking Foundation Models on University-Level Physics Problem Solving

Paper • 2503.21821 • Published 28 days ago • 17

upvoted a paper 27 days ago

MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search

Paper • 2503.20757 • Published 27 days ago • 10

upvoted a paper about 1 month ago

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 88

updated a Space about 1 month ago

Paper

🌖

Retrieve and compile recent research papers from Google Scholar profiles

upvoted 3 papers about 1 month ago

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13 • 50

Charting and Navigating Hugging Face's Model Atlas

Paper • 2503.10633 • Published Mar 13 • 77

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 159

published a Space about 1 month ago

Paper

🌖

Retrieve and compile recent research papers from Google Scholar profiles

upvoted 2 papers about 2 months ago

IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval

Paper • 2503.04644 • Published Mar 6 • 21

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 109

upvoted a paper 3 months ago

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published Jan 21 • 86