hexuan21 committed on
Commit daffd51 · verified · 1 Parent(s): 835f205

Update README.md

Files changed (1):
  1. README.md +2 -1
README.md CHANGED
@@ -34,7 +34,7 @@ averaged among all the evaluation aspects as indicator.
 For GenAI-Bench and VBench, which include human preference data among two or more videos,
 we employ the model's output to predict preferences and use pairwise accuracy as the performance indicator.
 | metric | Final Sum Score | VideoEval-test | EvalCrafter | GenAI-Bench | VBench |
-|-------------------|:---------------:|:--------------:|:-----------:|:-----------:|:----------:|
+|:-----------------:|:---------------:|:--------------:|:-----------:|:-----------:|:----------:|
 | MantisScore (reg) | **278.3** | 75.7 | **51.1** | **78.5** | **73.0** |
 | MantisScore (gen) | 222.4 | **77.1** | 27.6 | 59.0 | 58.7 |
 | Gemini-1.5-Pro | <u>158.8</u> | 22.1 | 22.9 | 60.9 | 52.9 |
@@ -56,6 +56,7 @@ we employ the model's output to predict preferences and use pairwise accuracy as
 | Kosmos-2 | - | - | - | - | - |
 | CogVLM | - | - | - | - | - |
 | OpenFlamingo | - | - | - | - | - |
+
 The best in MantisScore series is in bold and the best in baselines is underlined.
 "-" means the answer of MLLM is meaningless or in wrong format.
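The diff context above describes scoring GenAI-Bench and VBench with pairwise accuracy: a preference pair counts as correct when the model scores the human-preferred video higher. A minimal sketch of that metric, assuming one scalar model score per video (function and variable names here are illustrative, not from the MantisScore codebase):

```python
def pairwise_accuracy(scores, preference_pairs):
    """Fraction of human preference pairs the model ranks correctly.

    scores: dict mapping video id -> scalar model score.
    preference_pairs: list of (preferred_id, other_id) tuples from annotators.
    """
    correct = sum(
        1 for preferred, other in preference_pairs
        if scores[preferred] > scores[other]
    )
    return correct / len(preference_pairs)


# Toy usage with made-up scores and preferences:
scores = {"vid_a": 0.8, "vid_b": 0.3, "vid_c": 0.5}
pairs = [("vid_a", "vid_b"), ("vid_c", "vid_b"), ("vid_a", "vid_c")]
print(pairwise_accuracy(scores, pairs))  # 1.0 — all three pairs ranked correctly
```

Ties (equal scores) count as incorrect under this sketch; how the benchmarks actually break ties is not specified in this diff.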
62