Update pipeline tag, add project page link, quick start and other tags

#1
by nielsr (HF staff) - opened

Files changed (1)
  1. README.md +54 -2

README.md CHANGED
@@ -1,14 +1,18 @@
 ---
+base_model: Qwen/Qwen2-VL-7B-Instruct
 library_name: transformers
 license: apache-2.0
-base_model: Qwen/Qwen2-VL-7B-Instruct
 tags:
 - llama-factory
 - full
 - generated_from_trainer
+- long-context
+- reasoning
+- multi-modal
 model-index:
 - name: TVC-7B
   results: []
+pipeline_tag: image-text-to-text
 ---
 
 ## Model Summary
@@ -16,10 +20,10 @@ model-index:
 The TVC models are 7B parameter models based on Qwen2-VL-7B-Instruct model with a context window of 8K tokens.
 
 - **Repository:** https://github.com/sun-hailong/TVC
+- **Project Page:** https://sun-hailong.github.io/projects/TVC/
 - **Languages:** English, Chinese
 - **Paper:** https://arxiv.org/abs/2503.13360
 
-
 ### Model Architecture
 
 - **Architecture:** Qwen2-VL-7B-Instruct
@@ -39,6 +43,54 @@ The TVC models are 7B parameter models based on Qwen2-VL-7B-Instruct model with
 - Datasets 3.1.0
 - Tokenizers 0.20.3
 
+## Quick Start
+
+```python
+from vllm import LLM, SamplingParams
+from PIL import Image
+
+model_name = "Allen8/TVC-72B"
+llm = LLM(
+    model=model_name,
+    trust_remote_code=True,
+    tensor_parallel_size=8,
+)
+
+question = ("Hint: Please answer the question requiring an integer answer and "
+            "provide the final value, e.g., 1, 2, 3, at the end.\n"
+            "Question: Subtract all red things. Subtract all tiny matte balls. "
+            "How many objects are left?\n"
+            "Please answer the question using a long-chain reasoning style and "
+            "think step by step.")
+placeholder = "<|image_pad|>"
+prompt = ("<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
+          f"<|im_start|>user\n<|vision_start|>{placeholder}<|vision_end|>"
+          f"{question}<|im_end|>\n"
+          "<|im_start|>assistant\n")
+
+sampling_params = SamplingParams(
+    temperature=0.0,
+    top_k=1,
+    top_p=1.0,
+    stop_token_ids=[],
+    repetition_penalty=1.05,
+    max_tokens=8192,
+)
+
+image = Image.open("images/case1.png")
+inputs = {
+    "prompt": prompt,
+    "multi_modal_data": {
+        "image": image
+    },
+}
+
+outputs = llm.generate([inputs], sampling_params=sampling_params)
+print(outputs[0].outputs[0].text)
+```
+
 ## Citation
 
 ```