Large Multi-modal Models Can Interpret Features in Large Multi-modal Models Paper • 2411.14982 • Published Nov 22, 2024
Multimodal-SAE Collection A collection of sparse autoencoders (SAEs) hooked onto LLaVA • 4 items • Updated 7 days ago
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024
LLaVA-Critic Collection A general evaluator for assessing model performance • 6 items • Updated Oct 6, 2024
LLaVA-Video Collection Models focused on video understanding (previously known as LLaVA-NeXT-Video) • 6 items • Updated Oct 5, 2024
LLaVA-OneVision Collection LLaVA-OneVision models for single-image, multi-image, and video scenarios • 9 items • Updated Sep 18, 2024
LongVA Collection Long Context Transfer from Text to Vision: https://lmms-lab.github.io/posts/longva/ • 5 items • Updated Oct 4, 2024
LLaVA-OneVision Collection Models that handle arbitrary types of visual input (see the loading sketch after this list) • 15 items • Updated Oct 5, 2024
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Paper • 2407.12772 • Published Jul 17, 2024
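For the LLaVA-OneVision collections above, a minimal loading sketch under stated assumptions is shown below. It assumes the Hugging Face-converted checkpoint llava-hf/llava-onevision-qwen2-0.5b-ov-hf and the transformers LlavaOnevisionForConditionalGeneration integration; the original lmms-lab checkpoints are instead loaded through the LLaVA-NeXT codebase.

```python
import torch
import requests
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

# Assumed checkpoint: the Hugging Face-converted LLaVA-OneVision 0.5B model,
# not the original lmms-lab checkpoint from the collection above.
model_id = "llava-hf/llava-onevision-qwen2-0.5b-ov-hf"

processor = AutoProcessor.from_pretrained(model_id)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Single-image chat turn; multi-image and video inputs follow the same chat template.
image = Image.open(requests.get(
    "http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
conversation = [{
    "role": "user",
    "content": [{"type": "image"}, {"type": "text", "text": "What is shown in this image?"}],
}]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device, torch.float16)

output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```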