arxiv:2604.23772

PageGuide: Browser extension to assist users in navigating a webpage and locating information

Published on Apr 26 · Submitted by TTN on Apr 28

AI-generated summary

PageGuide is a browser extension that enhances AI assistant interactions by providing visual grounding of responses in web page elements, improving verification, guidance, and focus during web browsing tasks.

Abstract

Users browsing the web daily struggle to quickly locate relevant information in cluttered pages, complete unfamiliar multi-step tasks, and stay focused amid distracting content. State-of-the-art AI assistants (e.g., ChatGPT, Gemini, Claude) and browser agents (e.g., OpenAI Operator, Browser Use) can answer questions and automate actions, yet they return answers without showing where the information comes from on the page, forcing users to manually verify results and blindly trust every automated step. We present PageGuide, a browser extension that grounds LLM answers directly in the HTML DOM via visual overlays, addressing three core user needs: (a) Find: locating and highlighting relevant evidence in situ so users can instantly verify answers on the page; (b) Guide: showing step-by-step instructions (e.g., how to change a password) one at a time so users can follow and perform actions by themselves; and (c) Hide: hiding distracting content while giving users the chance to decide whether to hide each element. In a user study (N=94), PageGuide outperforms unaided browsing across all modes: Hide accuracy improves by 26 percentage points (86.7% relative gain) and task completion time drops by 70%; Guide completion rate increases by 30 percentage points; and Find reduces manual search effort, with Ctrl+F usage falling by 80% and task time decreasing by 19%. Code and demo are available at pageguide.github.io.
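The abstract reports the Hide accuracy gain both in percentage points (26) and as a relative gain (86.7%). The two figures together imply a baseline, which the short calculation below back-derives. The resulting baseline and aided accuracies are not stated in the abstract; they are an inference from rounded numbers and should be read as approximate. The function name `implied_baseline` is purely illustrative.

```python
def implied_baseline(pp_gain: float, relative_gain: float) -> float:
    """Baseline (in percent) implied by an absolute gain in percentage
    points and a relative gain expressed as a fraction.

    relative_gain = pp_gain / baseline  =>  baseline = pp_gain / relative_gain
    """
    return pp_gain / relative_gain


# Figures taken from the abstract: +26 percentage points, 86.7% relative gain.
pp_gain = 26.0
relative_gain = 0.867

baseline = implied_baseline(pp_gain, relative_gain)  # inferred, not reported
aided = baseline + pp_gain                           # inferred, not reported

print(f"implied baseline ~ {baseline:.1f}%, implied aided ~ {aided:.1f}%")
# prints: implied baseline ~ 30.0%, implied aided ~ 56.0%
```

Under these assumptions, Hide accuracy would move from roughly 30% unaided to roughly 56% with PageGuide, which is why a 26-point change corresponds to such a large relative gain.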

Community

Paper submitter

Users browsing the web daily struggle to locate relevant information on cluttered pages, complete unfamiliar multi-step tasks, and stay focused amid distracting content. State-of-the-art AI assistants and browser agents return answers without showing where information comes from, forcing users to manually verify results and blindly trust every automated step.

We present PageGuide, a browser extension that grounds LLM answers directly in the HTML DOM via visual overlays, addressing three core user needs:

Find: locating and highlighting relevant evidence in situ so users can instantly verify answers on the page;
Guide: showing step-by-step instructions one at a time so users can follow and perform actions by themselves;
Hide: hiding distracting content with a per-element justification and a reviewable checklist.
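As a rough illustration of what grounding an answer in the DOM could involve for the Find mode, the sketch below matches an evidence quote against the visible text of page elements and returns the most specific element containing it. This is not PageGuide's implementation: the `PageElement` type, the `locate_evidence` function, the selectors, and the substring-matching strategy are all assumptions made for the example, and a real extension would run in the browser against live DOM nodes.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageElement:
    selector: str  # hypothetical CSS selector identifying the element
    text: str      # visible text extracted from the element


def normalize(s: str) -> str:
    """Collapse whitespace and lowercase so matching tolerates layout noise."""
    return re.sub(r"\s+", " ", s).strip().lower()


def locate_evidence(quote: str, elements: List[PageElement]) -> Optional[PageElement]:
    """Return the element with the shortest text that contains the quote.

    Preferring the shortest match picks the most specific element
    (a paragraph rather than the whole page body). Returns None when
    no element contains the quote, so the caller can avoid highlighting
    something the page does not actually say.
    """
    needle = normalize(quote)
    if not needle:
        return None
    matches = [e for e in elements if needle in normalize(e.text)]
    return min(matches, key=lambda e: len(e.text), default=None)


# Toy page: the body contains everything, the paragraph holds the evidence.
elements = [
    PageElement("body", "Shipping Returns accepted within 30 days of delivery. Contact us"),
    PageElement("#returns p", "Returns accepted within  30 days of delivery."),
    PageElement("#footer", "Contact us"),
]

hit = locate_evidence("returns accepted within 30 days", elements)
print(hit.selector if hit else "no evidence found")  # prints: #returns p
```

Returning `None` for unmatched quotes is a deliberate choice in this sketch: an overlay is only useful for verification if it never points at text that does not support the answer.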
In a within-subject controlled user study (N = 94), PageGuide outperforms unaided browsing across all modes: Hide accuracy improves by 26 percentage points and task time drops by 70%; Guide completion rate increases by 30 percentage points; and Find reduces Ctrl+F usage by 80% and task time by 19%.

Paper submitter

We published our user study data here: https://huggingface.co/datasets/ttn0011/pageguide_userstudy
