Abstract
GUI agents, powered by large foundation models, can interact with digital interfaces, enabling applications in web automation, mobile navigation, and software testing. However, their increasing autonomy has raised critical concerns about security, privacy, and safety. This survey examines the trustworthiness of GUI agents across five dimensions: security vulnerabilities, reliability in dynamic environments, transparency and explainability, ethical considerations, and evaluation methodologies. We also identify major challenges, including vulnerability to adversarial attacks, cascading failure modes in sequential decision-making, and a lack of realistic evaluation benchmarks. These issues not only hinder real-world deployment but also call for comprehensive mitigation strategies that go beyond task-success metrics. As GUI agents become more widespread, establishing robust safety standards and responsible development practices is essential. This survey provides a foundation for advancing trustworthy GUI agents through systematic understanding and future research.
Community
This survey provides a timely and comprehensive overview of trustworthiness in GUI agents, covering critical dimensions such as security, reliability, explainability, and ethics. It fills a significant gap by shifting focus from functional performance to holistic trust, offering valuable insights and future directions for safe and responsible deployment.
This is an automated message from the Librarian Bot. The following papers, similar to this one, were recommended by the Semantic Scholar API:
- A Survey on Trustworthy LLM Agents: Threats and Countermeasures (2025)
- Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis (2025)
- Position: Standard Benchmarks Fail - LLM Agents Present Overlooked Risks for Financial Applications (2025)
- Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks (2025)
- Survey on Evaluation of LLM-based Agents (2025)
- ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning (2025)
- Building Safe GenAI Applications: An End-to-End Overview of Red Teaming for Large Language Models (2025)