Breaking the Data Barrier -- Building GUI Agents Through Task Generalization Paper • 2504.10127 • Published 4 days ago
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding Paper • 2504.09925 • Published 5 days ago