- Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration
  Paper • 1802.08802 • Published
- Mapping Natural Language Commands to Web Elements
  Paper • 1808.09132 • Published
- Learning to Navigate the Web
  Paper • 1812.09195 • Published
- Interactive Task and Concept Learning from Natural Language Instructions and GUI Demonstrations
  Paper • 1909.00031 • Published
Collections including paper arxiv:2401.10935
- Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
  Paper • 2404.05719 • Published • 62
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 36
- CogAgent: A Visual Language Model for GUI Agents
  Paper • 2312.08914 • Published • 29
- SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
  Paper • 2401.10935 • Published • 4

- Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration
  Paper • 2310.00280 • Published • 3
- Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models
  Paper • 2311.09278 • Published • 7
- OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
  Paper • 2402.07456 • Published • 41
- SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
  Paper • 2401.10935 • Published • 4

- OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
  Paper • 2402.07456 • Published • 41
- SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
  Paper • 2401.10935 • Published • 4
- Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models
  Paper • 2311.09278 • Published • 7
- Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration
  Paper • 2310.00280 • Published • 3

- AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
  Paper • 2402.00769 • Published • 20
- LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
  Paper • 2311.05556 • Published • 79
- LongAlign: A Recipe for Long Context Alignment of Large Language Models
  Paper • 2401.18058 • Published • 21
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 16

- MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
  Paper • 2310.09478 • Published • 19
- Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
  Paper • 2310.08678 • Published • 12
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 240
- LLaMA: Open and Efficient Foundation Language Models
  Paper • 2302.13971 • Published • 13