xjtupanda/MiniCPM-V-200K-video-finetune

Video-Text-to-Text

feature-extraction


T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs

💻 GitHub | 📑 Paper

Model Summary

This model is part of the T2Vid project.
It is a video-LLM fine-tuned from the image-LLM MiniCPM-Llama3-V-2_5.
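As a rough illustration of how a checkpoint like this is typically loaded, here is a minimal sketch using the Hugging Face `transformers` Auto classes. It assumes this repository ships custom modeling code in the same way the base model MiniCPM-Llama3-V-2_5 does (hence `trust_remote_code=True`), and that BF16 weights and a CUDA device are appropriate. The helper name `load_model` is hypothetical, and the project's own inference scripts on GitHub should be preferred for actual video input.

```python
REPO_ID = "xjtupanda/MiniCPM-V-200K-video-finetune"


def load_model(repo_id: str = REPO_ID, device: str = "cuda"):
    """Hypothetical helper: load the checkpoint and its tokenizer.

    Calling this downloads the full model weights, so it is only defined
    here and not invoked.
    """
    # Imported lazily so that defining the helper needs no heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # trust_remote_code=True executes modeling code shipped in the repository.
    # Assumption: this fine-tune needs it, as the base model does.
    model = AutoModel.from_pretrained(
        repo_id,
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    return model.to(device).eval(), tokenizer
```

Only enable `trust_remote_code` for repositories whose code you have reviewed, since it runs Python from the downloaded repository.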

License

Model License

The code in this repo is released under the Apache-2.0 License.
The usage of MiniCPM-V series model weights must strictly follow MiniCPM Model License.md.
The models and weights of MiniCPM are completely free for academic research. After filling out a "questionnaire" for registration, they are also available for free commercial use.

Statement

As an LLM, MiniCPM-Llama3-V 2.5 generates content by learning from a large amount of text, but it cannot comprehend, express personal opinions, or make value judgements. Anything generated by MiniCPM-Llama3-V 2.5 does not represent the views and positions of the model developers.
We will not be liable for any problems arising from the use of the MiniCPM-V open-source model, including but not limited to data security issues, risk of public opinion, or any risks and problems arising from the misdirection, misuse, or dissemination of the model.

Training dataset

100K video instruction data from Video-ChatGPT
100K video caption data from ShareGemini


Safetensors · Model size: 8.54B params · Tensor type: BF16
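The parameter count and tensor type above give a quick estimate of how much memory the weights alone occupy. The sketch below works through that arithmetic, assuming every parameter is stored in BF16 (2 bytes each). It ignores activations, the KV cache, and framework overhead, so real usage will be higher. The function name `weight_footprint_bytes` is illustrative only.

```python
NUM_PARAMS = 8.54e9     # "8.54B params" as listed on this model card
BYTES_PER_PARAM = 2     # BF16 stores each value in 16 bits = 2 bytes


def weight_footprint_bytes(num_params: float = NUM_PARAMS,
                           bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the storage needed for the weights alone, in bytes."""
    return num_params * bytes_per_param


total = weight_footprint_bytes()
# Report in both decimal gigabytes and binary gibibytes.
print(f"{total / 1e9:.2f} GB ({total / 2**30:.2f} GiB)")
```

Under these assumptions the weights come to roughly 17 GB, a useful lower bound when choosing a GPU.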


Model tree for xjtupanda/MiniCPM-V-200K-video-finetune

Base model: openbmb/MiniCPM-Llama3-V-2_5
Finetuned (7): this model


Collection including xjtupanda/MiniCPM-V-200K-video-finetune

T2Vid

T2Vid is a data augmentation method that enriches the instruction diversity of video data. In this collection, you will find related data and weights. (5 items, updated Nov 28, 2024)