VideoChat2-TPO

This model is based on the paper "Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment".

πŸƒ Installation

pip install -r requirements.txt
python app.py

🔧 Usage

from transformers import AutoModel, AutoTokenizer
from tokenizer import MultimodalLlamaTokenizer

model_path = "OpenGVLab/VideoChat-TPO"

# The repository ships custom code, so trust_remote_code=True is required.
tokenizer = AutoTokenizer.from_pretrained(
    model_path,
    trust_remote_code=True,
    use_fast=False,
)
# Pass the tokenizer created above; `self` is not defined at script level.
model = AutoModel.from_pretrained(
    model_path, trust_remote_code=True, _tokenizer=tokenizer
).eval()
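
The loading steps above can also be wrapped in a small helper so that the tokenizer and model are always created together. This is only a sketch: the function name `load_videochat_tpo` and the constant `MODEL_PATH` are hypothetical, and it assumes the model should receive the tokenizer created just before it through the `_tokenizer` keyword. It leaves out the local `tokenizer` module import, on the assumption that `trust_remote_code=True` is enough to resolve the custom classes.

```python
from typing import Any, Tuple

# Repository id taken from the usage snippet.
MODEL_PATH = "OpenGVLab/VideoChat-TPO"


def load_videochat_tpo(model_path: str = MODEL_PATH) -> Tuple[Any, Any]:
    """Return (tokenizer, model) loaded as in the usage snippet (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed; calling it does.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        model_path,
        trust_remote_code=True,  # the repository contains custom code
        use_fast=False,
    )
    # Assumption: `_tokenizer` expects the tokenizer instance created above.
    model = AutoModel.from_pretrained(
        model_path,
        trust_remote_code=True,
        _tokenizer=tokenizer,
    ).eval()  # switch to inference mode
    return tokenizer, model
```

Calling `tokenizer, model = load_videochat_tpo()` downloads the weights on first use, so it needs network access and enough memory for the checkpoint.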

Model size: 8.1B params (Safetensors; tensor types: I64, BF16)