Apply for community grant: Company project (gpu)

#1
by adamlu1 - opened
Microsoft org

OmniParser v2 is a comprehensive method for parsing user interface screenshots into structured and easy-to-understand elements, which significantly enhances the ability of GPT-4o to generate actions that can be accurately grounded in the corresponding regions of the interface. Since the release of v1, there has been a lot of interest in v2, which reduces latency by 60% and offers a more fine-grained understanding of small icons.

We sincerely hope to demo OmniParser v2 and make it more widely available for the benefit of the open-source community.
@hysts

adamlu1 changed discussion status to closed