Text Generation
Transformers
Safetensors
English
phi-llava
custom_code
Inference Endpoints

Model Card for LLaVa-Phi-2-3B

Model Details

Model Description

  • Developed by: LAION, SkunkworksAI & Ontocord
  • Model type: LLaVA is an open-source chatbot trained by fine-tuning Phi-2 on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture.
  • Finetuned from model: Phi-2
  • License: MIT
  • Demo: llava-phi-2-3b-demo

Model Sources

Evaluation

Benchmarks

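| Model | Parameters | SQA | GQA | TextVQA | POPE |
| --- | --- | --- | --- | --- | --- |
| LLaVA-1.5 | 7.3B | 68.0 | 62.0 | 58.3 | 85.3 |
| MC-LLaVA-3B | 3B | - | 49.6 | 38.59 | - |
| LLaVA-Phi | 3B | 68.4 | - | 48.6 | 85.0 |
| moondream1 | 1.6B | - | 56.3 | 39.8 | - |
| llava-phi-2-3b | 3B | 69.0 | 51.2 | 47.0 | 86.0 |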

Image Captioning (MS COCO)

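| Model | BLEU_1 | BLEU_2 | BLEU_3 | BLEU_4 | METEOR | ROUGE_L | CIDEr | SPICE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| llava-1.5-7b | 75.8 | 59.8 | 45 | 33.3 | 29.4 | 57.7 | 108.8 | 23.5 |
| llava-phi-2-3b | 67.7 | 50.5 | 35.7 | 24.2 | 27.0 | 52.4 | 85.0 | 20.7 |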
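
One way to read the captioning numbers is as a fraction of the larger model's score on each metric. The sketch below is illustrative only: the scores are copied from the MS COCO rows above, while the helper name `relative_scores` is a hypothetical introduced here and is not part of any released code.

```python
# MS COCO captioning scores copied from the table above.
METRICS = ["BLEU_1", "BLEU_2", "BLEU_3", "BLEU_4",
           "METEOR", "ROUGE_L", "CIDEr", "SPICE"]
LLAVA_15_7B = dict(zip(METRICS, [75.8, 59.8, 45.0, 33.3, 29.4, 57.7, 108.8, 23.5]))
LLAVA_PHI_2_3B = dict(zip(METRICS, [67.7, 50.5, 35.7, 24.2, 27.0, 52.4, 85.0, 20.7]))


def relative_scores(model, baseline):
    """Return each metric of `model` as a percentage of `baseline` (hypothetical helper)."""
    return {name: round(100.0 * model[name] / baseline[name], 1) for name in baseline}


if __name__ == "__main__":
    # Print how much of the 7B model's score the 3B model retains per metric.
    for name, pct in relative_scores(LLAVA_PHI_2_3B, LLAVA_15_7B).items():
        print(f"{name}: {pct}% of llava-1.5-7b")
```

Ratios like these only summarize the table; they say nothing about how the scores were produced or whether the evaluation settings matched.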

Model size: 2.79B params (Safetensors, tensor type F32)
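
The parameter count and tensor type give a rough lower bound on the memory needed just to hold the weights. The following is a back-of-the-envelope sketch, assuming 4 bytes per F32 parameter and ignoring activations, the vision encoder's runtime buffers, and framework overhead. The function name and the non-F32 entries are illustrative assumptions, not figures reported for this model.

```python
# Rough weight-memory estimate from the figures reported on this card.
PARAMS = 2.79e9  # "2.79B params" as listed above

# Bytes per parameter for common tensor types; only F32 is reported for this
# model, the other entries are included for comparison (assumption).
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2, "INT8": 1}


def weight_memory_gb(num_params, dtype):
    """Approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("F32", "F16", "INT8"):
        print(f"{dtype}: ~{weight_memory_gb(PARAMS, dtype):.2f} GB")
```

Actual memory use at inference time will be higher than these weight-only figures.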