Zhiyuan Ning

nzynzy

AI & ML interests

None yet

Organizations

None yet

nzynzy's activity

upvoted a paper about 13 hours ago

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published 3 days ago • 61

upvoted 5 papers about 19 hours ago

MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?

Paper • 2504.09702 • Published 8 days ago • 16

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published 6 days ago • 56

C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing

Paper • 2504.07964 • Published 11 days ago • 60

Self-Steering Language Models

Paper • 2504.07081 • Published 12 days ago • 17

Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?

Paper • 2504.06514 • Published 13 days ago • 38

upvoted 6 papers 4 days ago

How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients

Paper • 2504.10766 • Published 7 days ago • 38

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published 7 days ago • 81

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Paper • 2504.08942 • Published 10 days ago • 26

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published 20 days ago • 80

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published 12 days ago • 71

JudgeLRM: Large Reasoning Models as a Judge

Paper • 2504.00050 • Published 22 days ago • 59

upvoted a paper 17 days ago

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published 19 days ago • 52

upvoted 3 papers 19 days ago

PaperBench: Evaluating AI's Ability to Replicate AI Research

Paper • 2504.01848 • Published 19 days ago • 35

ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations

Paper • 2504.00824 • Published 20 days ago • 39

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published 26 days ago • 43

upvoted a paper 20 days ago

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published 21 days ago • 61

upvoted a paper 22 days ago

AgentRxiv: Towards Collaborative Autonomous Research

Paper • 2503.18102 • Published 29 days ago • 21

upvoted 2 papers 28 days ago

TULIP: Towards Unified Language-Image Pretraining

Paper • 2503.15485 • Published Mar 19 • 47

Why Do Multi-Agent LLM Systems Fail?

Paper • 2503.13657 • Published Mar 17 • 43