hexuan21 committed on
Commit 9210115 · verified · 1 Parent(s): a965f0f

Update README.md

Files changed (1):
  1. README.md +22 -22
README.md CHANGED
@@ -41,32 +41,32 @@ with real videos excluded.
  The evaluation results are shown below:
 
 
- | metric | Final Sum Score | VideoFeedback-test | EvalCrafter | GenAI-Bench | VBench |
- |:-----------------:|:---------------:|:--------------:|:-----------:|:-----------:|:----------:|
- | MantisScore (reg) | **278.3** | 75.7 | **51.1** | **78.5** | **73.0** |
- | MantisScore (gen) | 222.4 | **77.1** | 27.6 | 59.0 | 58.7 |
- | Gemini-1.5-Pro | <u>158.8</u> | 22.1 | 22.9 | 60.9 | 52.9 |
- | Gemini-1.5-Flash | 157.5 | 20.8 | 17.3 | <u>67.1</u> | 52.3 |
- | GPT-4o | 155.4 | <u>23.1</u> | 28.7 | 52.0 | 51.7 |
- | CLIP-sim | 126.8 | 8.9 | <u>36.2</u> | 34.2 | 47.4 |
- | DINO-sim | 121.3 | 7.5 | 32.1 | 38.5 | 43.3 |
- | SSIM-sim | 118.0 | 13.4 | 26.9 | 34.1 | 43.5 |
- | CLIP-Score | 114.4 | -7.2 | 21.7 | 45.0 | 54.9 |
- | LLaVA-1.5-7B | 108.3 | 8.5 | 10.5 | 49.9 | 39.4 |
- | LLaVA-1.6-7B | 93.3 | -3.1 | 13.2 | 44.5 | 38.7 |
- | X-CLIP-Score | 92.9 | -1.9 | 13.3 | 41.4 | 40.1 |
- | PIQE | 78.3 | -10.1 | -1.2 | 34.5 | <u>55.1</u> |
- | BRISQUE | 75.9 | -20.3 | 3.9 | 38.5 | 53.7 |
- | Idefics2 | 73.0 | 6.5 | 0.3 | 34.6 | 31.7 |
- | SSIM-dyn | 42.5 | -5.5 | -17.0 | 28.4 | 36.5 |
- | MES-dyn | 36.7 | -12.9 | -26.4 | 31.4 | 44.5 |
- | Fuyu | - | - | - | - | - |
+ | metric | Final Avg Score | VideoFeedback-test | EvalCrafter | GenAI-Bench | VBench |
+ |:-----------------:|:--------------:|:--------------:|:-----------:|:-----------:|:----------:|
+ | MantisScore (reg) | **69.6** | 75.7 | **51.1** | **78.5** | **73.0** |
+ | MantisScore (gen) | 55.6 | **77.1** | 27.6 | 59.0 | 58.7 |
+ | Gemini-1.5-Pro | <u>39.7</u> | 22.1 | 22.9 | 60.9 | 52.9 |
+ | Gemini-1.5-Flash | 39.4 | 20.8 | 17.3 | <u>67.1</u> | 52.3 |
+ | GPT-4o | 38.9 | <u>23.1</u> | 28.7 | 52.0 | 51.7 |
+ | CLIP-sim | 31.7 | 8.9 | <u>36.2</u> | 34.2 | 47.4 |
+ | DINO-sim | 30.3 | 7.5 | 32.1 | 38.5 | 43.3 |
+ | SSIM-sim | 29.5 | 13.4 | 26.9 | 34.1 | 43.5 |
+ | CLIP-Score | 28.6 | -7.2 | 21.7 | 45.0 | 54.9 |
+ | LLaVA-1.5-7B | 27.1 | 8.5 | 10.5 | 49.9 | 39.4 |
+ | LLaVA-1.6-7B | 23.3 | -3.1 | 13.2 | 44.5 | 38.7 |
+ | X-CLIP-Score | 23.2 | -1.9 | 13.3 | 41.4 | 40.1 |
+ | PIQE | 19.6 | -10.1 | -1.2 | 34.5 | <u>55.1</u> |
+ | BRISQUE | 19.0 | -20.3 | 3.9 | 38.5 | 53.7 |
+ | Idefics2 | 18.3 | 6.5 | 0.3 | 34.6 | 31.7 |
+ | SSIM-dyn | 10.6 | -5.5 | -17.0 | 28.4 | 36.5 |
+ | MES-dyn | 9.2 | -12.9 | -26.4 | 31.4 | 44.5 |
+ <!-- | Fuyu | - | - | - | - | - |
  | Kosmos-2 | - | - | - | - | - |
  | CogVLM | - | - | - | - | - |
- | OpenFlamingo | - | - | - | - | - |
+ | OpenFlamingo | - | - | - | - | - | -->
 
  The best in MantisScore series is in bold and the best in baselines is underlined.
- "-" means the answer of MLLM is meaningless or in wrong format.
+ <!-- "-" means the answer of MLLM is meaningless or in wrong format. -->
 
  ## Usage
  ### Installation
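
This commit renames the summary column from a sum to a mean: each new "Final Avg Score" appears to be the old "Final Sum Score" divided by the four benchmarks. A minimal sketch checking two rows against the table values (the dict and its row selection are mine, not part of the repo):

```python
# Sanity-check the Sum -> Avg column conversion in the diff above.
# Per-benchmark scores copied from the table, in order:
# (VideoFeedback-test, EvalCrafter, GenAI-Bench, VBench)
rows = {
    "MantisScore (reg)": (75.7, 51.1, 78.5, 73.0),  # old sum 278.3, new avg 69.6
    "Gemini-1.5-Pro":    (22.1, 22.9, 60.9, 52.9),  # old sum 158.8, new avg 39.7
}

for name, scores in rows.items():
    total = sum(scores)           # the old "Final Sum Score"
    avg = total / len(scores)     # the new "Final Avg Score"
    print(f"{name}: sum={total:.1f}, avg={avg:.1f}")
```

The per-benchmark columns are unchanged by the commit; only the aggregate column is rescaled, so the relative ranking of methods stays the same.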