lastdefiance20 committed on
Commit
85c8e9f
·
verified ·
1 Parent(s): dde3f7b

Update content.py

Files changed (1)
  1. content.py +17 -6
content.py CHANGED
@@ -13,9 +13,7 @@ Bottom_logo = f'''<img src="data:image/jpeg;base64,{bottom_logo}" style="width:2
 intro_md = f'''
 # {benchname} Leaderboard
 
-* [📊 Dataset](https://huggingface.co/datasets/maum-ai/KOFFVQA_Data)
-* [🧪 Evaluation Code](https://github.com/maum-ai/KOFFVQA)
-* [📄 Report](https://arxiv.org/abs/2503.23730)
+[**🏆 Leaderboard**](https://huggingface.co/spaces/maum-ai/KOFFVQA-Leaderboard) | [**📄 KOFFVQA arXiv**](https://arxiv.org/abs/2503.23730) | [**🤗 KOFFVQA Dataset**](https://huggingface.co/datasets/maum-ai/KOFFVQA_Data)
 
 {benchname}🔍 is a Free-Form VQA benchmark dataset designed to evaluate Vision-Language Models (VLMs) in Korean language environments. Unlike traditional multiple-choice or predefined answer formats, KOFFVQA challenges models to generate open-ended, natural-language answers to visually grounded questions. This allows for a more comprehensive assessment of a model's ability to understand and generate nuanced Korean responses.
 
@@ -34,22 +32,35 @@ The {benchname} benchmark is designed to evaluate and compare the performance of
 This benchmark includes a total of 275 Korean questions across 10 tasks. The questions are open-ended, free-form VQA (Visual Question Answering) with objective answers, allowing responses without strict format constraints.
 
 ## News
+* **2025-04-25**: Our [leaderboard](https://huggingface.co/spaces/maum-ai/KOFFVQA-Leaderboard) has finished evaluating a total of **81** well-known VLMs, spanning both open- and closed-source models. We are also refactoring the evaluation code to make it easier to use and to support a more diverse range of models.
 
-* **2025-04-01** : Our paper [KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language](https://arxiv.org/abs/2503.23730) has released and accepted to CVPRW 2025, Workshop on Benchmarking and Expanding AI Multimodal Approaches(BEAM 2025) 🎉
+* **2025-04-01**: Our paper [KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language](https://arxiv.org/abs/2503.23730) has been released and accepted to CVPRW 2025, Workshop on Benchmarking and Expanding AI Multimodal Approaches (BEAM 2025) 🎉
 
 * **2025-01-21**: [Evaluation code](https://github.com/maum-ai/KOFFVQA) and [dataset](https://huggingface.co/datasets/maum-ai/KOFFVQA_Data) release
 
 * **2024-12-06**: Leaderboard Release!
 
+## Citation
+
+**BibTeX:**
+```bibtex
+@article{kim2025koffvqa,
+  title={KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language},
+  author={Kim, Yoonshik and Jung, Jaeyoon},
+  journal={arXiv preprint arXiv:2503.23730},
+  year={2025}
+}
+```
+
 '''.strip()
 
 submit_md = f'''
 
-# Submit (coming soon)
+# Submit
 
 We are not accepting model addition requests at the moment. Once the request system is established, we will start accepting requests.
 
-🚀 Curious how your VLM performs in Korean? Use our [Evaluation code](https://github.com/maum-ai/KOFFVQA) to run it on KOFFVQA and check the score.
+🚀 Wondering how your VLM stacks up in Korean? Just run it with our [evaluation code](https://github.com/maum-ai/KOFFVQA) and get your score, no API key needed!
 
 🧑‍⚖️ We currently use google/gemma-2-9b-it as the judge model, so there's no need to worry about API keys or usage fees.
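For context, `intro_md` and `submit_md` above are module-level f-string templates interpolating `{benchname}` and then stripped of surrounding whitespace. A minimal sketch of how that pattern renders, assuming `benchname` is set to `"KOFFVQA"` elsewhere in `content.py` (the body text here is shortened for illustration):

```python
# Sketch of the f-string templating pattern used in content.py.
# ASSUMPTION: benchname = "KOFFVQA"; the real value is defined elsewhere
# in the module.
benchname = "KOFFVQA"

intro_md = f'''
# {benchname} Leaderboard

{benchname} is a Free-Form VQA benchmark for Korean-language VLMs.
'''.strip()  # .strip() removes the newlines that surround the triple-quoted literal

print(intro_md.splitlines()[0])  # → # KOFFVQA Leaderboard
```

Because the template is evaluated at import time, every occurrence of `{benchname}` is substituted once when the module loads, which is why the commit can reuse the same placeholder across both markdown blocks.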