Shenzhi Wang

shenzhi-wang

https://shenzhi-wang.netlify.app/

ShenzhiWang_THU

AI & ML interests

Large Language Model, Reinforcement Learning, and AI Agents

Recent Activity

upvoted a paper 18 days ago

Soft Adaptive Policy Optimization

upvoted a paper 2 months ago

IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance

upvoted a paper 3 months ago

Variational Reasoning for Language Models

authored a paper 6 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 187

authored a paper 7 months ago

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 188

authored 2 papers 8 months ago

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

Paper • 2504.05535 • Published Apr 7 • 44

authored 2 papers about 1 year ago

DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution

Paper • 2411.02359 • Published Nov 4, 2024 • 13

LLM-based Optimization of Compound AI Systems: A Survey

Paper • 2410.16392 • Published Oct 21, 2024 • 16

authored 5 papers over 1 year ago

Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing

Paper • 2407.08770 • Published Jul 11, 2024 • 21

Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning

Paper • 2310.17966 • Published Oct 27, 2023

LLM Agents for Psychology: A Study on Gamified Assessments

Paper • 2402.12326 • Published Feb 19, 2024

Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance

Paper • 2309.01448 • Published Sep 4, 2023

DiveR-CT: Diversity-enhanced Red Teaming with Relaxing Constraints

Paper • 2405.19026 • Published May 29, 2024 • 8

authored a paper about 2 years ago

Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation

Paper • 2310.01320 • Published Oct 2, 2023 • 9

authored a paper over 2 years ago

Boosting Offline Reinforcement Learning with Action Preference Query

Paper • 2306.03362 • Published Jun 6, 2023 • 2
