VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI Paper • 2410.11623 • Published Oct 15, 2024 • 46
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models Paper • 2403.07714 • Published Mar 12, 2024 • 1
Can Vision-Language Models Think from a First-Person Perspective? Paper • 2311.15596 • Published Nov 27, 2023 • 3
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data Paper • 2309.11235 • Published Sep 20, 2023 • 16
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models Paper • 2405.15738 • Published May 24, 2024 • 43