Update README.md
README.md
@@ -10,6 +10,7 @@ pipeline_tag: image-text-to-text
 ## R1-Onevision
 
 [\[📂 GitHub\]](https://github.com/Fancy-MLLM/R1-Onevision)[\[📝 Report\]](https://yangyi-vai.notion.site/r1-onevision?pvs=4)
+[\[🤗 HF Demo\]](https://huggingface.co/spaces/Fancy-MLLM/R1-Onevision)
 [\[🤗 HF Dataset\]](https://huggingface.co/datasets/Fancy-MLLM/R1-onevision) [\[🤗 Reasoning Benchmark\]](https://huggingface.co/datasets/Fancy-MLLM/R1-OneVision-Bench) [\[🤗 HF Demo\]](https://huggingface.co/spaces/Fancy-MLLM/R1-OneVision)
 
 ## Model Overview
@@ -102,6 +103,7 @@ print(output_text)
 We are also focused on integrating Chinese multimodal reasoning CoT data into the training process. By adding this language-specific dataset, we aim to improve the model’s capability to perform reasoning tasks in Chinese, expanding its multilingual and multimodal reasoning proficiency.
 
 4. **Release of the 3B Model**
+
 
 We are working on the release of a smaller, more efficient 3B model, which is designed to provide a balance between performance and resource efficiency. This model aims to deliver strong multimodal reasoning capabilities while being more accessible and optimized for environments with limited computational resources, offering a more compact alternative to the current 7B model.
 