arXiv:2412.04455

Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection

Published on Dec 5 · Submitted by Zhoues on Dec 6
Abstract

Automatic detection and prevention of open-set failures are crucial in closed-loop robotic systems. Recent studies often struggle to simultaneously identify unexpected failures reactively after they occur and prevent foreseeable ones proactively. To address this, we propose Code-as-Monitor (CaM), a novel paradigm that leverages a vision-language model (VLM) for both open-set reactive and proactive failure detection. The core of our method is to formulate both tasks as a unified set of spatio-temporal constraint satisfaction problems and to use VLM-generated code to evaluate them for real-time monitoring. To enhance the accuracy and efficiency of monitoring, we further introduce constraint elements, which abstract constraint-related entities or their parts into compact geometric elements. This abstraction offers greater generality, simplifies tracking, and facilitates constraint-aware visual programming by using these elements as visual prompts. Experiments show that, under severe disturbances, CaM achieves a 28.7% higher success rate and reduces execution time by 31.8% compared to baselines across three simulators and a real-world setting. Moreover, CaM can be integrated with open-loop control policies to form closed-loop systems, enabling long-horizon tasks in cluttered scenes with dynamic environments.
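To make the "constraints as VLM-generated code" idea concrete, here is a minimal sketch of what such a monitor could look like. Everything in it is a hypothetical illustration rather than the authors' actual API: the `ConstraintElement` class, the `monitor_pour` function, and the thresholds are all invented for this example.

```python
# Hypothetical sketch of a VLM-generated constraint monitor in the spirit of
# Code-as-Monitor. All names and thresholds are illustrative assumptions.
from dataclasses import dataclass

import numpy as np


@dataclass
class ConstraintElement:
    """An entity (or part) abstracted into a compact geometric element."""
    name: str
    points: np.ndarray  # (N, 3) tracked 3D points of the element


def center(elem: ConstraintElement) -> np.ndarray:
    """Centroid of the element's tracked points."""
    return elem.points.mean(axis=0)


def monitor_pour(cup: ConstraintElement, bowl: ConstraintElement) -> str:
    """Evaluate spatio-temporal constraints for a pouring step on one frame.

    Returns "ok", "warn" for a proactive signal (failure is foreseeable),
    or "fail" for a reactive signal (failure has already occurred).
    """
    # Proactive constraint: while pouring, the cup must stay roughly above
    # the bowl in the horizontal plane, or liquid is about to spill.
    xy_offset = np.linalg.norm(center(cup)[:2] - center(bowl)[:2])
    if xy_offset > 0.05:  # metres; illustrative threshold
        return "warn"

    # Reactive constraint: if the cup's centroid drops below the bowl's,
    # the cup has been dropped -- the failure has already happened.
    if center(cup)[2] < center(bowl)[2]:
        return "fail"

    return "ok"
```

In the paper's framing, a function like `monitor_pour` would be generated by the VLM from the task and its constraints, then executed every frame over the tracked constraint elements, which is what lets both reactive and proactive detection run in real time.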

Community

Paper author · Paper submitter

🔥Code-as-Monitor🔥

We present Code-as-Monitor, a novel paradigm that leverages VLMs for open-set reactive and proactive failure detection.

Highlights:

  • Code-as-Monitor is the first framework to integrate both reactive and proactive failure detection.
  • Code-as-Monitor leverages the proposed constraint elements to simplify real-time failure detection with high precision (see the sketch after this list).
  • Code-as-Monitor achieves state-of-the-art (SOTA) performance in both simulated and real-world environments, and exhibits strong generalization to unseen scenarios, tasks, and objects.
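
As a companion to the second highlight, the sketch below illustrates how abstracting an object part into a compact geometric element could make per-frame constraint checks cheap. The PCA-based fitting and the `is_upright` constraint are assumptions made for illustration, not the paper's actual element types or pipeline.

```python
# Hypothetical sketch: reduce a tracked object part to a compact geometric
# element (a centroid plus a principal axis), then check a constraint on it.
import numpy as np


def fit_axis_element(part_points: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Reduce an (N, 3) point cloud of an object part to (centroid, unit axis).

    Tracking two small vectors per frame is far cheaper than reasoning over
    raw pixels, which is what keeps the monitoring loop real-time.
    """
    centroid = part_points.mean(axis=0)
    # The first right-singular vector of the centered points is the
    # principal axis of the point cloud.
    _, _, vt = np.linalg.svd(part_points - centroid)
    return centroid, vt[0]


def is_upright(axis: np.ndarray, tol_deg: float = 10.0) -> bool:
    """Check an orientation constraint such as 'keep the cup upright'."""
    cos_tilt = abs(float(axis @ np.array([0.0, 0.0, 1.0])))
    tilt_deg = np.degrees(np.arccos(np.clip(cos_tilt, 0.0, 1.0)))
    return tilt_deg <= tol_deg
```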


