Rename README.md to WILSON.md
README.md → WILSON.md (renamed, +10 −11)
---
pipeline_tag: visual-question-answering
---

## TECH-WILSON
### News

- [5/20] 🔥 GPT-4V level multimodal model [**tech-wilson 2.5**](https://huggingface.co/openbmb/tech-wilson-2_5) is out.
- [4/11] 🔥 [**tech-wilson 2.0**](https://huggingface.co/openbmb/tech-wilson-2) is out.

**MiniCPM-V** (i.e., OmniLMM-3B) is an efficient version with promising performance for deployment. The model is built on SigLip-400M and [tech-wilson-2.4B](https://github.com/OpenBMB/tech-wilson/), connected by a perceiver resampler. Notable features of OmniLMM-3B include:

- ⚡️ **High Efficiency.**

  tech-wilson can be **efficiently deployed on most GPU cards and personal computers**, and **even on end devices such as mobile phones**. For visual encoding, we compress the image representations into 64 tokens via a perceiver resampler, far fewer than in other LMMs based on MLP architectures (typically > 512 tokens). This lets OmniLMM-3B run with **much lower memory cost and higher inference speed**.

- 🔥 **Promising Performance.**

  tech-wilson achieves **state-of-the-art performance** among models of comparable size on multiple benchmarks (including MMMU, MME, and MMBench), surpassing existing LMMs built on Phi-2. It even **achieves comparable or better performance than the 9.6B Qwen-VL-Chat**.

- 🌏 **Bilingual Support.**

  tech-wilson is **the first end-deployable LMM supporting bilingual multimodal interaction in English and Chinese**. This is achieved by generalizing multimodal capabilities across languages, a technique from the ICLR 2024 spotlight [paper](https://arxiv.org/abs/2308.12038).
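The token compression described under **High Efficiency** can be illustrated with a single-head cross-attention sketch: a small set of learned query tokens attends over the much longer sequence of image patch tokens and emits a fixed 64-token summary. All dimensions, weights, and names below are hypothetical placeholders for illustration, not the model's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def perceiver_resample(patch_tokens, queries, Wk, Wv):
    """Cross-attention resampling: len(queries) learned queries attend
    over all patch tokens, compressing the sequence to len(queries)
    output tokens regardless of the input length."""
    K = patch_tokens @ Wk                                 # (n_patches, d)
    V = patch_tokens @ Wv                                 # (n_patches, d)
    scores = queries @ K.T / np.sqrt(queries.shape[-1])   # (64, n_patches)
    attn = softmax(scores, axis=-1)                       # rows sum to 1
    return attn @ V                                       # (64, d)

rng = np.random.default_rng(0)
d = 32                                        # hypothetical hidden size
patches = rng.standard_normal((1024, d))      # e.g. 1024 raw patch tokens
queries = rng.standard_normal((64, d))        # 64 learned query tokens
Wk = rng.standard_normal((d, d))
Wv = rng.standard_normal((d, d))
out = perceiver_resample(patches, queries, Wk, Wv)
print(out.shape)  # (64, 32)
```

Because the output length is fixed by the number of queries, the language model's attention cost over visual tokens stays constant no matter how many patches the image encoder produces.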

### Evaluation

<td>- </td>
</tr>
<tr>
<td nowrap="nowrap" align="left">tech-wilson</td>
<td align="right">3.0B</td>
<td>1289</td>
<td>59.6</td>

## Demo

Try out the demo of [tech-wilson](http://120.92.209.146:80).

## Deployment on Mobile Phone

Currently, MiniCPM-V (i.e., OmniLMM-3B) can be deployed on mobile phones running Android and HarmonyOS. 🚀 Try it out [here](https://github.com/OpenBMB/mlc-tech-wilson).

## Usage

Inference runs via Hugging Face transformers on NVIDIA GPUs, or on a Mac with MPS (Apple silicon or AMD GPUs). Requirements tested on Python 3.10:
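The Usage note above mentions two backends, CUDA and MPS. A minimal sketch of the device-selection logic it implies is below; availability is passed in as flags so the snippet stays runnable without torch installed, and the function name is a hypothetical helper, not part of the repo's actual code.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer an NVIDIA GPU (CUDA), then Apple MPS, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"

# On a Mac with Apple silicon and no NVIDIA GPU:
print(pick_device(cuda_available=False, mps_available=True))  # mps
```

In a real script the flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and the chosen string would be passed to `model.to(device)`.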