Shaofei Cai

phython96

https://phython96.github.io

AI & ML interests

Embodied Decision Making, Computer Vision, Game AI, Robotics

Recent Activity

upvoted a paper 6 days ago

Revisiting Multimodal Positional Encoding in Vision-Language Models

Paper • 2510.23095 • Published 13 days ago • 18

upvoted a paper 26 days ago

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published 27 days ago • 173

upvoted 3 papers about 1 month ago

Training-Free Group Relative Policy Optimization

Paper • 2510.08191 • Published about 1 month ago • 44

Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning

Paper • 2509.22601 • Published Sep 26 • 29

LongLive: Real-time Interactive Long Video Generation

Paper • 2509.22622 • Published Sep 26 • 181

authored a paper 3 months ago

Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents

Paper • 2507.23698 • Published Jul 31 • 9

commented 2 papers 3 months ago

Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents

Paper • 2507.23698 • Published Jul 31 • 9

Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents

Paper • 2507.23698 • Published Jul 31 • 9

updated a collection 3 months ago

ROCKET

Collection

ROCKET is a research series exploring vision-based goal specification methods. • 12 items • Updated Sep 21 • 2

upvoted a paper 3 months ago

Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents

Paper • 2507.23698 • Published Jul 31 • 9

commented a paper 3 months ago

Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents

Paper • 2507.23698 • Published Jul 31 • 9

authored 3 papers 4 months ago

upvoted a paper 4 months ago

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective

Paper • 2507.01925 • Published Jul 2 • 38

updated a collection 7 months ago

GROOT

Collection

GROOT is a research series investigating how self-supervised and weakly supervised learning can be used to train agents that follow instructions. • 3 items • Updated Aug 3 • 2

authored a paper 7 months ago

GROOT-2: Weakly Supervised Multi-Modal Instruction Following Agents

Paper • 2412.10410 • Published Dec 7, 2024

upvoted a collection 7 months ago

GROOT

Collection

GROOT is a research series investigating how self-supervised and weakly supervised learning can be used to train agents that follow instructions. • 3 items • Updated Aug 3 • 2

upvoted a paper 7 months ago

Generative Evaluation of Complex Reasoning in Large Language Models

Paper • 2504.02810 • Published Apr 3 • 14

commented a paper 7 months ago

ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment

Paper • 2503.02505 • Published Mar 4 • 7
