Tsinghua University

Recent Activity

fansunqi submitted a paper 19 days ago

Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task

kkakkkka submitted a paper 21 days ago

MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment

BBQGOD authored a paper about 1 month ago

ReSpec: Towards Optimizing Speculative Decoding in Reinforcement Learning Systems

Papers

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task

fansunqi

submitted a paper to Daily Papers 19 days ago

Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task

Paper • 2512.10359 • Published 20 days ago • 3

kkakkkka

submitted a paper to Daily Papers 21 days ago

MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment

Paper • 2512.06628 • Published 24 days ago • 12

MasterVito

authored a paper 27 days ago

Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions

Paper • 2512.00097 • Published Nov 27 • 2

yangkaiSIGS

authored 9 papers about 1 month ago

Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners

Paper • 2509.26226 • Published Sep 30 • 33

EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control

Paper • 2511.15248 • Published Nov 19 • 6

Exploration and Anti-Exploration with Distributional Random Network Distillation

Paper • 2401.09750 • Published Jan 18, 2024

A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation

Paper • 2407.00496 • Published Jun 29, 2024

BATON: Aligning Text-to-Audio Model with Human Preference Feedback

Paper • 2402.00744 • Published Feb 1, 2024

Novelty-Guided Data Reuse for Efficient and Diversified Multi-Agent Reinforcement Learning

Paper • 2412.15517 • Published Dec 20, 2024

Exploration by Random Distribution Distillation

Paper • 2505.11044 • Published May 16

Novelty-based Sample Reuse for Continuous Robotics Control

Paper • 2410.13490 • Published Oct 17, 2024

CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning

Paper • 2406.07541 • Published Jun 11, 2024

MasterVito

authored a paper 3 months ago

Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR

Paper • 2509.23808 • Published Sep 28 • 47

MasterVito

authored a paper 4 months ago

Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

Paper • 2508.14029 • Published Aug 19 • 118

DuJinHua

authored a paper 5 months ago

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8 • 195

Diankun

authored a paper 6 months ago

Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling

Paper • 2507.07982 • Published Jul 10 • 33

MasterVito

authored a paper 6 months ago

TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression

Paper • 2506.02678 • Published Jun 3 • 5

MasterVito

authored 2 papers 7 months ago

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs

Paper • 2506.14245 • Published Jun 17 • 45

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Paper • 2506.08989 • Published Jun 10 • 14

zhennan1

authored a paper 7 months ago

Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration

Paper • 2505.21471 • Published May 27 • 5