GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models Paper • 2410.06154 • Published Oct 8, 2024
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content Paper • 2410.10783 • Published Oct 14, 2024
MATE: Masked Autoencoders are Online 3D Test-Time Learners Paper • 2211.11432 • Published Nov 21, 2022
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge Paper • 2303.08914 • Published Mar 15, 2023