Image-Text-to-Text
Transformers
Safetensors
English
MLLM
Inference Endpoints
liyang committed on
Commit f1f977d
1 Parent(s): 0069ff2

update readme

Files changed (1)
  1. README.md +38 -20
README.md CHANGED
@@ -10,30 +10,45 @@ language:
  - en
  ---

- # Model Card
- <!-- Provide a quick summary of what the model is/does. -->
- Parrot is a multi-language and multi-modal large language model capable of achieving excellent performance.
  For a comprehensive introduction, please refer to [Parrot Paper](https://arxiv.org/abs/2406.02539) and [Parrot GitHub](https://github.com/AIDC-AI/Parrot).

- # Model Details
- ![](https://github.com/AIDC-AI/Parrot/images/teaser.png)

- # Performance
- ![](https://github.com/AIDC-AI/Parrot/images/performance.png)
- ![](https://github.com/AIDC-AI/Parrot/images/performance_table.png)
- # Usage

- Below is a code snippet to run Parrot with multimodal inputs. For additional usage instructions, including inference wrapper and Gradio UI, please refer to [Parrot GitHub](https://github.com/AIDC-AI/Parrot).
- ```markdown
- pip install torch==2.1.2 transformers==4.43.2 pillow==10.3.0
- ```
- ```python
- import torch
- from PIL import Image
- from transformers import AutoModelForCausalLM
- ```
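The removed snippet stops at the imports. A minimal sketch of how it might continue is shown below; the `trust_remote_code` flag, dtype, and image path are assumptions, and the actual multimodal preprocessing and generation calls come from the inference wrapper in the Parrot GitHub repository.

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

# Assumption: the checkpoint ships custom modelling code loadable via trust_remote_code.
model = AutoModelForCausalLM.from_pretrained(
    "AIDC-AI/Parrot-7B",
    torch_dtype=torch.float16,
    trust_remote_code=True,
).eval()

# Illustrative inputs; the real prompt format and image preprocessing
# are defined by the repository's inference wrapper.
image = Image.open("example.jpg").convert("RGB")
prompt = "Describe this image."
```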

- # Citation
  If you find Parrot useful, please cite the paper

  ```markdown
@@ -45,5 +60,8 @@ If you find Parrot useful, please cite the paper
  }
  ```

- # License
  The project is licensed under Apache License Version 2.0 and is restricted to uses that comply with the license agreements of Qwen and CLIP.

  - en
  ---

+ # Parrot-7B
+
+ ## Introduction
+ Welcome to Parrot, a novel method that uses textual guidance to drive visual token alignment at the language level.
+ Parrot conditions the visual tokens on diverse language inputs and uses a Mixture-of-Experts (MoE) module to promote the alignment of multilingual tokens.
+ Moreover, given the current lack of benchmarks for evaluating multilingual capabilities in the field, we collect and release MMMB, a Massive Multilingual Multimodal Benchmark covering 6 languages, 15 categories, and 12,000 questions.
  For a comprehensive introduction, please refer to [Parrot Paper](https://arxiv.org/abs/2406.02539) and [Parrot GitHub](https://github.com/AIDC-AI/Parrot).
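As a rough illustration of the mechanism described above (this is not the released Parrot code; the module name, expert count, and dimensions are assumptions), conditioning visual tokens on a text embedding through an MoE router could look like this:

```python
# Toy sketch of language-guided MoE re-projection of visual tokens (illustrative only).
import torch
import torch.nn as nn


class LanguageGuidedMoE(nn.Module):
    def __init__(self, dim: int = 1024, num_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # routes on the text embedding
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))

    def forward(self, visual_tokens: torch.Tensor, text_embed: torch.Tensor) -> torch.Tensor:
        # visual_tokens: (batch, tokens, dim); text_embed: (batch, dim)
        weights = torch.softmax(self.router(text_embed), dim=-1)                # (batch, experts)
        outputs = torch.stack([e(visual_tokens) for e in self.experts], dim=1)  # (batch, experts, tokens, dim)
        return torch.einsum("be,betd->btd", weights, outputs)                   # language-weighted mixture


moe = LanguageGuidedMoE()
aligned = moe(torch.randn(2, 576, 1024), torch.randn(2, 1024))
print(aligned.shape)  # torch.Size([2, 576, 1024])
```

The routed visual tokens would then be fed to the LLM together with the multilingual text; see the paper for the actual design.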

+ ## Model
+ Parrot is a multilingual multimodal large language model. We provide our fully finetuned models below:

+ | Model | Base LLM | Vision Encoder | Stage | Download |
+ | --- | --- | :---: | :---: | :---: |
+ | Parrot-7B | Qwen-1.5-7B-Chat | CLIP-ViT-Large-patch14-336 | SFT | [ckpt](https://huggingface.co/AIDC-AI/Parrot-7B) |
+ | Parrot-14B | Qwen-1.5-14B-Chat | CLIP-ViT-Large-patch14-336 | SFT | [ckpt](https://huggingface.co/AIDC-AI/Parrot-14B) |

+ <div align="center">
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6076587d310e510df1db14bc/FAfbL6IqE7ZJdcx4_qQlF.png" width="600px" />
+ </div>
+
+
+ ## Performance
+ <div align="center">
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6076587d310e510df1db14bc/ZmTOkUZk5_UC1t0ExSmjM.png" width="400px" />
+ </div>
+ <div align="center">
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/6076587d310e510df1db14bc/Njdnvzcx7BsH7HkK-ylRo.png" width="100%" />
+ </div>

+
+ ## Quick Start
+ We provide a quick start demo in [Parrot GitHub](https://github.com/AIDC-AI/Parrot), which can be used as a template to run Parrot for inference.
+
+ 1. Before running the demo, make sure you have downloaded the [Parrot checkpoint](https://huggingface.co/AIDC-AI/Parrot-7B) and the [CLIP checkpoint](https://huggingface.co/openai/clip-vit-large-patch14-336).
+ 2. Then replace the paths in `runner.py` with your local checkpoint paths.
+ 3. Finally, run the Python file; see the sketch below for these steps.
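The sketch below walks through steps 1-3 with `huggingface_hub`; the repository IDs come from the links above, while the way `runner.py` names its path variables is an assumption and is left as a comment.

```python
# Step 1: download the Parrot and CLIP checkpoints from the Hugging Face Hub.
from huggingface_hub import snapshot_download

parrot_path = snapshot_download(repo_id="AIDC-AI/Parrot-7B")
clip_path = snapshot_download(repo_id="openai/clip-vit-large-patch14-336")
print("Parrot checkpoint:", parrot_path)
print("CLIP checkpoint:", clip_path)

# Step 2: edit runner.py so its model and vision-encoder paths point at
# parrot_path and clip_path (the exact variable names depend on the repo).
# Step 3: run the demo, e.g.  python runner.py
```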
+
+
+ ## Citation
  If you find Parrot useful, please cite the paper

  ```markdown

  }
  ```

+ ## License
  The project is licensed under Apache License Version 2.0 and is restricted to uses that comply with the license agreements of Qwen and CLIP.
+
+ ## Disclaimer
+ We used compliance-checking algorithms during the training process to ensure the compliance of the trained model to the best of our ability. Due to the complexity of the data and the diversity of language model usage scenarios, we cannot guarantee that the model is completely free of copyright issues or improper content. If you believe anything infringes on your rights or generates improper content, please contact us, and we will promptly address the matter.