metadata

language:
  - zh
  - en
tags:
  - internvl
  - multimodal
  - vision-language
  - food
  - finetuned
license: apache-2.0
datasets:
  - food-recognition
model-index:
  - name: InternVL2-2B-Food-Finetuned
    results:
      - task:
          type: vision-language-understanding
          name: food-recognition
        dataset:
          name: food-dataset
          type: custom
        metrics:
          - name: Accuracy
            type: accuracy
            value: 85.5
          - name: F1-Score
            type: f1
            value: 84.3

InternVL2-2B Food Recognition Finetuned Model

Model Description

This is a multimodal model fine-tuned from InternVL2-2B with LoRA on a food recognition dataset. It is specifically optimized for understanding and describing food images.

Key Features

Base model: InternVL2-2B
Fine-tuning method: LoRA (Low-Rank Adaptation)
Training iterations: 640
Domain: food recognition and description
Multimodal capabilities: image understanding and text generation
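
To illustrate why LoRA is a lightweight way to adapt a model, the sketch below compares the parameter count of a dense weight matrix with the two low-rank factors LoRA trains in its place. The layer sizes and rank are hypothetical values chosen for illustration; they are not taken from this model's configuration.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix W of shape (d_out, d_in)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one matrix.

    LoRA learns an update B @ A, where A has shape (rank, d_in)
    and B has shape (d_out, rank); W itself stays frozen.
    """
    return rank * (d_in + d_out)


# Hypothetical sizes for illustration only (assumption, not from the model card).
d_in, d_out, rank = 2048, 2048, 16

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
print(f"dense: {full}, LoRA: {lora}, ratio: {lora / full:.4f}")
```

With a small rank, the trainable factors are a small fraction of the dense matrix, which is what makes fine-tuning a 2B model on a narrow domain such as food images comparatively cheap.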

Training Details

Base Model

Architecture: InternVL2
Parameters: 2B
Type: vision-language multimodal model

Fine-tuning

Method: LoRA
Config file: internvl_v2_internlm2_2b_lora_finetune_food.py
Training steps: 640
Learning rate: 3.5e-5
Training epochs: 10
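
The step and epoch counts above can be related with simple arithmetic. The sketch below assumes both figures describe the same run and that the 640 steps cover all 10 epochs; the batch size and dataset size are not stated in this card, so they are left out.

```python
def steps_per_epoch(total_steps: int, epochs: int) -> float:
    """Average number of optimizer steps per epoch."""
    if epochs <= 0:
        raise ValueError("epochs must be positive")
    return total_steps / epochs


# Values as listed in this model card.
total_steps = 640
epochs = 10

print(steps_per_epoch(total_steps, epochs))
```

Under that assumption, each epoch corresponds to 64 training steps.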

