Video-Text-to-Text
Transformers
Safetensors
English
llava
text-generation
multimodal
Eval Results
Inference Endpoints
ZhangYuanhan committed on
Commit ecc105e
1 Parent(s): 9010a84

Update README.md

Files changed (1)
  1. README.md +5 -2
README.md CHANGED
@@ -1,6 +1,7 @@
 ---
 datasets:
 - lmms-lab/LLaVA-NeXT-Video-SFT-Data
+- lmms-lab/LLaVA-OneVision-Data
 language:
 - en
 library_name: transformers
@@ -112,6 +113,8 @@ model-index:
 value: 63.3
 name: accuracy
 verified: true
+base_model:
+- lmms-lab/llava-onevision-qwen2-7b-si
 ---
 
 # LLaVA-Video-7B-Qwen2
@@ -127,7 +130,7 @@ model-index:
 
 ## Model Summary
 
-The LLaVA-OneVision models are 7/72B parameter models trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data), based on Qwen2 language model with a context window of 32K tokens.
+The LLaVA-Video models are 7/72B parameter models trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data), based on Qwen2 language model with a context window of 32K tokens.
 
 - **Repository:** [LLaVA-VL/LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT?tab=readme-ov-file)
 - **Point of Contact:** [Yuanhan Zhang](https://zhangyuanhan-ai.github.io/)
@@ -179,7 +182,7 @@ def load_video(self, video_path, max_frames_num,fps=1,force_sample=False):
     spare_frames = vr.get_batch(frame_idx).asnumpy()
     # import pdb;pdb.set_trace()
     return spare_frames,frame_time,video_time
-pretrained = "lmms-lab/LLaVA-NeXT-Video-7B-Qwen2"
+pretrained = "lmms-lab/LLaVA-Video-7B-Qwen2"
 model_name = "llava_qwen"
 device = "cuda"
 device_map = "auto"
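
For readers following the renamed checkpoint, the sketch below shows how the updated `pretrained` id could be used with the LLaVA-NeXT codebase, together with a frame-sampling helper in the spirit of the `load_video` fragment visible in the last hunk. It assumes the `llava` package from the linked repository and `decord` are installed; `load_pretrained_model` is the loader that codebase exposes, while the body of `load_video` here is an illustrative reconstruction rather than the exact function in the README.

```python
# Minimal sketch, not the exact README snippet: load the renamed checkpoint and
# sample frames from a video. Assumes the LLaVA-NeXT repo's `llava` package and
# decord are installed; the load_video body is an illustrative reconstruction.
import numpy as np
from decord import VideoReader, cpu
from llava.model.builder import load_pretrained_model

pretrained = "lmms-lab/LLaVA-Video-7B-Qwen2"  # repo id introduced by this commit
model_name = "llava_qwen"
device_map = "auto"

# Loader exposed by the LLaVA-NeXT codebase: returns tokenizer, model,
# image processor, and the model's context length.
tokenizer, model, image_processor, max_length = load_pretrained_model(
    pretrained, None, model_name, device_map=device_map
)
model.eval()

def load_video(video_path, max_frames_num, fps=1, force_sample=False):
    """Sample frames at roughly `fps` per second, falling back to uniform
    sampling when the clip would exceed `max_frames_num` (or when forced)."""
    vr = VideoReader(video_path, ctx=cpu(0), num_threads=1)
    total_frames = len(vr)
    video_time = total_frames / vr.get_avg_fps()
    step = max(1, round(vr.get_avg_fps() / fps))
    frame_idx = list(range(0, total_frames, step))
    if len(frame_idx) > max_frames_num or force_sample:
        frame_idx = np.linspace(0, total_frames - 1, max_frames_num, dtype=int).tolist()
    frame_time = ",".join(f"{i / vr.get_avg_fps():.2f}s" for i in frame_idx)
    spare_frames = vr.get_batch(frame_idx).asnumpy()  # (num_frames, H, W, 3) uint8
    return spare_frames, frame_time, video_time
```

From there, the sampled frames would be run through `image_processor` and a chat-template prompt, as in the full usage example in the model card itself.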