Commit 0ca2470 by AtsuMiyai
1 parent: ffb6f1b

update explanations on MM-UPD Bench

Files changed (1)
  1. constants.py +2 -3
constants.py CHANGED
@@ -35,8 +35,7 @@ LEADERBORAD_INTRODUCTION = """
  <a href='https://arxiv.org/abs/2403.20331'><img src='https://img.shields.io/badge/cs.CV-Paper-b31b1b?logo=arxiv&logoColor=red'></a>
  </div>
 
- ## About MM-UPD Bench
- ### What is MM-UPD Bench?
+ ## What is MM-UPD Bench?
  MM-UPD Bench: A Comprehensive Benchmark for Evaluating the Trustworthiness of Vision Language Models (VLMs) in the Context of Unsolvable Problem Detection (UPD)
 
  Our MM-UPD Bench encompasses three benchmarks: MM-AAD, MM-IASD, and MM-IVQD.
@@ -54,7 +53,7 @@ MM-IVQD Bench is a dataset where the question is incompatible with the image.
  MM-IVQD evaluates the VLMs' capability to discern when a question and image are irrelevant or inappropriate.
 
 
- ### Characteristics of MM-UPD Bench
+ ## Characteristics of MM-UPD Bench
  We design MM-UPD Bench to provide a comprehensive evaluation of VLMs across multiple senarios.
 
  1\. **Multiple Senario Evaluation:** We carefully design prompts choices and examine the three senario: (i) Base (w/o instruction), (ii) Option (w/ additional option), (iii) Instruction (w/ additional instruction).