ViCrop: Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal Large Language Models Paper • 2310.16033 • Published Oct 24, 2023
The Curious Case of Nonverbal Abstract Reasoning with Multi-Modal Large Language Models Paper • 2401.12117 • Published Jan 22, 2024
MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning Paper • 2404.13591 • Published Apr 21, 2024
Learn Your Tokens: Word-Pooled Tokenization for Language Modeling Paper • 2310.11628 • Published Oct 17, 2023