Update README.md
## Model Overview

This is a multimodal large language model fine-tuned from Qwen2.5-VL on the **R1-Onevision** dataset. The model enhances vision-language understanding and reasoning, making it suitable for tasks such as visual reasoning and image understanding. With its robust ability to perform multimodal reasoning, R1-Onevision emerges as a powerful AI assistant capable of addressing a wide range of problem-solving challenges across different domains.

## Training Configuration and Curve

- Framework: The training process uses the open-source **LLaMA-Factory** library, with **Qwen2.5-VL-Instruct** as the base model. This model comes in three variants: 3B, 7B, and 32B.
```
# ... (earlier training options are not shown in this excerpt)
bf16: true
flash_attn: fa2
```
Training loss curve:

<img src="https://cdn-uploads.huggingface.co/production/uploads/65af78bb3e82498d4c65ed2a/8BNyo-v68aFvab2kXxtt1.png"/>
## Usage

You can load the model using the Hugging Face `transformers` library:

```python
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
import torch

MODEL_ID = "Fancy-MLLM/R1-Onevision-7B"
processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_ID,
    # ... (remaining loading arguments are omitted in this excerpt)
)

# ... (image preprocessing and generation steps are omitted in this excerpt)
output_text = processor.batch_decode(
    # ...
)
print(output_text)
```
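The snippet above is truncated here, so the preprocessing and generation steps are not shown. As a rough reference, a minimal end-to-end sketch following the generic Qwen2.5-VL inference pattern in `transformers` could look like the following; the image URL, prompt, loading options, and generation settings are illustrative placeholders, not values taken from this repository:

```python
from PIL import Image
import requests
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

MODEL_ID = "Fancy-MLLM/R1-Onevision-7B"
processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # bf16, matching the training configuration above
    device_map="auto",
)

# Placeholder image and question; replace with your own inputs.
image = Image.open(requests.get("https://example.com/demo.jpg", stream=True).raw)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image and reason about what is happening."},
        ],
    }
]

# Build the chat prompt and pack text + image into model inputs.
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

# Generate, then decode only the newly produced tokens.
generated = model.generate(**inputs, max_new_tokens=512)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated)]
output_text = processor.batch_decode(trimmed, skip_special_tokens=True)
print(output_text)
```

The upstream Qwen2.5-VL examples express the same flow with the `qwen_vl_utils` helpers for vision preprocessing; this sketch sticks to plain `transformers` plus `Pillow` to stay self-contained.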
## Ongoing Work

1. **Rule-Based Reinforcement Learning (RL)**