Model,Large Language Model,Validation Split,Test Split
BLIP-2,Flan-T5-XL,26.71,27.90
InstructBLIP,Flan-T5-XL,28.09,25.19
InstructBLIP Vicuna,Vicuna-7B,26.53,26.64
LLaVA,LLaMA-7B,27.0,28.16
MiniGPT-4,Vicuna-7B,28.11,30.93
VPGTrans,LLaMA-7B,27.38,24.12
MultiModal-GPT,Vicuna-7B,27.81,30.43
Otter,LLaMA-7B,28.08,30.87
OpenFlamingo,LLaMA-7B,27.67,30.18
LLaMA-Adapter V2,LLaMA-7B,27.81,30.43
GVT,Vicuna-7B,27.87,29.67
mPLUG-Owl,LLaMA-7B,27.63,31.31
mPLUG-Owl-2,LLaMA2-7B,27.84,30.37
Kosmos-2,Decoder only 1.3B,26.97,""
Qwen-VL-Chat,Qwen-7B,27.69,31.06
LLaVA-1.5,Vicuna-7B,27.81,29.80
VideoChat,Vicuna-7B,27.51,28.72
Video-ChatGPT,LLaMA-7B,27.33,29.17
Valley,LLaMA-13B,27.27,30.11
Video-LLaMA,LLaMA2-Chat-7B,28.58,30.30
SEED-LLaMA,LLaMA2-Chat-13B,29.93,""
SEED-X,LLaMA2-Chat-13B,31.07,29.92
DeepSeek-VL-Chat,DeepSeek-LLM-7B,27.57,26.01
CogVLM,Vicuna-7B,27.48,31.06
Yi-VL,Yi-6B,28.67,30.56
Xcomposer,InternLM-7B,37.17,36.36
Gemini-Pro-Vision,-,30.46,32.39
GPT-4V,-,37.98,37.25