metadata
title: CAPSTONE-PROJECT
emoji: 🚀
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 3.35.2
app_file: app.py
pinned: false
license: mit

Phi2: Multimodal Finetuning

Details

  1. LLM Backbone: Phi2
  2. Vision Tower: clip-vit-large-patch14-336
  3. Audio Model: Whisper
  4. Pretraining Dataset: LAION-CC-SBU dataset with BLIP captions (200k samples)
  5. Finetuning Dataset: Instruct 150k dataset based on COCO
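
To make the component list above concrete, here is a minimal sketch that records the three models in a small config object and works out how many image patches the vision tower produces per image. The class and function names are hypothetical and do not come from this repository. The hidden widths (1024 for the CLIP ViT-L tower, 2560 for Phi2) are assumptions based on the public model configurations, and the single linear projector is only an illustration, since the actual projector design is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultimodalConfig:
    """Component names from the Details list; numeric fields are assumptions."""
    llm_backbone: str = "Phi2"
    vision_tower: str = "clip-vit-large-patch14-336"
    audio_model: str = "Whisper"
    image_size: int = 336      # read off the "336" suffix of the vision tower name
    patch_size: int = 14       # read off the "patch14" part of the name
    vision_hidden: int = 1024  # assumed CLIP ViT-L hidden width
    llm_hidden: int = 2560     # assumed Phi2 hidden width


def num_image_patches(cfg: MultimodalConfig) -> int:
    """Number of non-overlapping patches the vision tower sees per image."""
    per_side = cfg.image_size // cfg.patch_size
    return per_side * per_side


def linear_projector_params(cfg: MultimodalConfig) -> int:
    """Weights plus biases of a hypothetical single linear projection layer
    mapping vision features into the LLM embedding space."""
    return cfg.vision_hidden * cfg.llm_hidden + cfg.llm_hidden


cfg = MultimodalConfig()
print(num_image_patches(cfg))        # patches per 336x336 image
print(linear_projector_params(cfg))  # size of the illustrative projector
```

Under these assumptions each image contributes a few hundred patch embeddings to the language model's input, which is why the projector and the context budget matter for both training stages.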

Design

image

Pretraining

Training Loss Curve

image

Learning Rate

image

Training Logs

image

Finetuning

Training Loss Curve

image

Learning Rate

image

Training Logs

image

Results

image