In a Training Loop 🔄

Rui Yang PRO

Ray2333

https://yangrui2015.github.io

YangRui2015

AI & ML interests

Deep Reinforcement Learning

Recent Activity

updated a dataset 24 days ago

GUI-Libra/GUI-Libra-81K-SFT

upvoted a paper 3 days ago

AgentSPEX: An Agent SPecification and EXecution Language

Paper • 2604.13346 • Published 11 days ago • 153

upvoted a paper 17 days ago

How Well Do Agentic Skills Work in the Wild: Benchmarking LLM Skill Usage in Realistic Settings

Paper • 2604.04323 • Published 19 days ago • 41

upvoted a paper about 2 months ago

GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL

Paper • 2602.22190 • Published Feb 25 • 17

upvoted a paper 2 months ago

Scalable Data Synthesis for Computer Use Agents with Step-Level Filtering

Paper • 2512.10962 • Published Nov 22, 2025 • 3

upvoted a paper 3 months ago

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published Feb 1 • 44

upvoted a paper 5 months ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 126

upvoted 4 papers 6 months ago

MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks

Paper • 2502.17832 • Published Feb 25, 2025 • 6

Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning

Paper • 2510.27623 • Published Oct 31, 2025 • 13

GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving

Paper • 2510.11769 • Published Oct 13, 2025 • 26

ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Paper • 2510.12693 • Published Oct 14, 2025 • 28

upvoted a paper 7 months ago

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Paper • 2509.25760 • Published Sep 30, 2025 • 55

upvoted a paper 8 months ago

Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3, 2025 • 23

upvoted a paper 9 months ago

Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities

Paper • 2507.13158 • Published Jul 17, 2025 • 24

upvoted a paper 10 months ago

Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models

Paper • 2506.18945 • Published Jun 23, 2025 • 41

upvoted 5 papers 11 months ago

GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents

Paper • 2506.03143 • Published Jun 3, 2025 • 54

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

Paper • 2505.22662 • Published May 28, 2025 • 6

MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning

Paper • 2505.24846 • Published May 30, 2025 • 15

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 146

AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97

upvoted a paper 12 months ago

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published May 5, 2025 • 25
