teowu committed 31f1019 (parent: 3b6f5a7): Create README.md

---
license: apache-2.0
datasets:
- q-future/Co-Instruct-DB
---
This checkpoint trains LLaVA-1.5-7B on the [Co-Instruct-562K dataset](https://arxiv.org/abs/2402.16641), for users who prefer the LLaVA structure.

*It is notably less accurate than the main version, https://huggingface.co/q-future/co-instruct; please refer to that checkpoint if you need a more accurate model.*

Preliminary results:

- Q-Bench-Single-MCQ (A1, test): 73.38% (Co-Instruct-Main: 77.11%, GPT-4V-Turbo: 74.10%, Q-Instruct-LLaVA-v1.5: 67.42%, LLaVA-v1.5: 60.07%)
- Q-Bench-Pair-MCQ (A1, test): 75.88% (Co-Instruct-Main: 80.18%, GPT-4V-Turbo: 78.07%, Q-Instruct-LLaVA-v1.5: 54.50%, LLaVA-v1.5: 52.25%)

We are working on improving it, but we also warn that this structure (direct projection) may not be well suited to multi-image scenarios.
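As a rough illustration of how single- versus pair-image queries differ under the LLaVA structure, the sketch below builds a LLaVA-v1.5-style single-turn prompt. This is an assumption, not the repository's own inference code: the exact chat template depends on the LLaVA codebase you load the checkpoint with, and the helper name `build_llava_prompt` is hypothetical.

```python
def build_llava_prompt(question: str, num_images: int = 1) -> str:
    """Build a LLaVA-v1.5-style single-turn prompt (a sketch only).

    Follows the common "USER: <image>\\n{question} ASSISTANT:" convention;
    each image is represented by one <image> placeholder token that the
    vision projector expands into visual tokens at inference time.
    """
    image_tokens = "<image>\n" * num_images
    return f"USER: {image_tokens}{question} ASSISTANT:"

# Single-image (Q-Bench-Single-MCQ-style) query:
print(build_llava_prompt("How is the clarity of this image?"))

# Pair-image (Q-Bench-Pair-MCQ-style) query, where the multi-image
# caveat above applies:
print(build_llava_prompt("Which of the two images has better clarity?",
                         num_images=2))
```

Because direct projection simply concatenates the projected visual tokens for each `<image>` placeholder, pair queries consume roughly twice the visual context of single queries, which is one reason this structure can struggle in multi-image scenarios.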