
---
license: apache-2.0
language:
- en
- zh
pipeline_tag: image-text-to-text
tags:
- multimodal
library_name: transformers
---

We evaluated XinYuan-VL-2B with the VLMEvalKit toolkit on the benchmarks listed below. On most of them, XinYuan-VL-2B outperforms Qwen/Qwen2-VL-2B-Instruct, released by Alibaba Cloud, as well as other influential open-source models of comparable parameter scale.


| Benchmark | MiniCPM-2B | InternVL-2B | Qwen2-VL-2B | XinYuan-VL-2B |
| :--- | :---: | :---: | :---: | :---: |
| MMB-CN-V11-Test | 64.5 | 68.9 | 71.2 | 74.3 |
| MMB-EN-V11-Test | 65.8 | 70.2 | 73.2 | 76.5 |
| MMB-EN | 69.1 | 74.4 | 74.3 | 78.9 |
| MMB-CN | 66.5 | 71.2 | 73.8 | 76.12 |
| CCBench | 45.3 | 74.7 | 53.7 | 55.5 |
| MMT-Bench | 53.5 | 50.8 | 54.5 | 55.2 |
| RealWorld | 55.8 | 57.3 | 62.9 | 63.9 |
| SEEDBench_IMG | 67.1 | 70.9 | 72.86 | 73.4 |
| AI2D | 56.3 | 74.1 | 74.7 | 74.2 |
| MMMU | 38.2 | 36.3 | 41.1 | 40.9 |
| HallusionBench | 36.2 | 36.2 | 42.4 | 55.00 |
| POPE | 86.3 | 86.3 | 86.82 | 89.42 |
| MME | 1808.6 | 1876.8 | 1872.0 | 1854.9 |
| MMStar | 39.1 | 49.8 | 47.5 | 51.87 |
| SEEDBench2_Plus | 51.9 | 59.9 | 62.23 | 62.98 |
| BLINK | 41.2 | 42.8 | 43.92 | 42.98 |
| OCRBench | 605 | 781 | 794 | 782 |
| TextVQA | 74.1 | 73.4 | 79.7 | 77.6 |
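
To make the comparison with Qwen2-VL-2B easier to check, the scores above can be tallied with a few lines of Python. This is only an illustrative sketch: the `SCORES` dictionary and the `compare` helper are hypothetical names, not part of VLMEvalKit, and the numbers are copied by hand from the table. It assumes a higher score is better on every benchmark.

```python
# Scores copied from the table above as (Qwen2-VL-2B, XinYuan-VL-2B).
# Assumption: higher is better on every benchmark listed.
SCORES = {
    "MMB-CN-V11-Test": (71.2, 74.3),
    "MMB-EN-V11-Test": (73.2, 76.5),
    "MMB-EN": (74.3, 78.9),
    "MMB-CN": (73.8, 76.12),
    "CCBench": (53.7, 55.5),
    "MMT-Bench": (54.5, 55.2),
    "RealWorld": (62.9, 63.9),
    "SEEDBench_IMG": (72.86, 73.4),
    "AI2D": (74.7, 74.2),
    "MMMU": (41.1, 40.9),
    "HallusionBench": (42.4, 55.00),
    "POPE": (86.82, 89.42),
    "MME": (1872.0, 1854.9),
    "MMStar": (47.5, 51.87),
    "SEEDBench2_Plus": (62.23, 62.98),
    "BLINK": (43.92, 42.98),
    "OCRBench": (794, 782),
    "TextVQA": (79.7, 77.6),
}


def compare(scores):
    """Split benchmark names by which model has the higher score."""
    wins = [name for name, (base, ours) in scores.items() if ours > base]
    losses = [name for name, (base, ours) in scores.items() if ours < base]
    return wins, losses


wins, losses = compare(SCORES)
print(f"XinYuan-VL-2B is higher on {len(wins)} of {len(SCORES)} benchmarks")
print("Qwen2-VL-2B is higher on:", ", ".join(losses))
```

With the numbers as listed, XinYuan-VL-2B scores higher on 12 of the 18 benchmarks, while Qwen2-VL-2B remains ahead on AI2D, MMMU, MME, BLINK, OCRBench, and TextVQA.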