liyang committed on
Commit 0069ff2
1 Parent(s): 49123f0

Update README.md

update model card

Files changed (1)
  1. README.md +49 -3
README.md CHANGED
@@ -1,3 +1,49 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ datasets:
+ - AIDC-AI/Parrot-dataset
+ library_name: transformers
+ tags:
+ - MLLM
+ pipeline_tag: image-text-to-text
+ language:
+ - en
+ ---
+
+ # Model Card
+ <!-- Provide a quick summary of what the model is/does. -->
+ Parrot is a multilingual, multimodal large language model (MLLM) that takes image and text input and generates text.
+ For a comprehensive introduction, see the [Parrot paper](https://arxiv.org/abs/2406.02539) and the [Parrot GitHub repository](https://github.com/AIDC-AI/Parrot).
+
+ # Model Details
+ ![](https://github.com/AIDC-AI/Parrot/images/teaser.png)
+
+ # Performance
+ ![](https://github.com/AIDC-AI/Parrot/images/performance.png)
+ ![](https://github.com/AIDC-AI/Parrot/images/performance_table.png)
+ # Usage
+
+ The snippet below installs the pinned dependencies and imports the packages needed to run Parrot with multimodal inputs. For additional usage instructions, including the inference wrapper and Gradio UI, see the [Parrot GitHub repository](https://github.com/AIDC-AI/Parrot).
+ ```bash
+ pip install torch==2.1.2 transformers==4.43.2 pillow==10.3.0
+ ```
+ ```python
+ import torch
+ from PIL import Image
+ from transformers import AutoModelForCausalLM
+ ```
+
+ # Citation
+ If you find Parrot useful, please cite the paper:
+
+ ```bibtex
+ @article{sun2024parrot,
+ title={Parrot: Multilingual Visual Instruction Tuning},
+ author={Sun, Hai-Long and Zhou, Da-Wei and Li, Yang and Lu, Shiyin and Yi, Chao and Chen, Qing-Guo and Xu, Zhao and Luo, Weihua and Zhang, Kaifu and Zhan, De-Chuan and others},
+ journal={arXiv preprint arXiv:2406.02539},
+ year={2024}
+ }
+ ```
+
+ # License
+ This project is licensed under the Apache License, Version 2.0, and its use must also comply with the license agreements of Qwen and CLIP.
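
The pip command in the Usage section pins exact versions of torch, transformers, and pillow. As a minimal sketch for confirming that an environment matches those pins before running the model, the following uses only the Python standard library. The helper name `check_pins` is hypothetical and is not part of Parrot or of any library named in the README.

```python
from importlib import metadata

# Pins copied from the README's pip command.
PINNED = {"torch": "2.1.2", "transformers": "4.43.2", "pillow": "10.3.0"}


def check_pins(pins=PINNED):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    Hypothetical helper for illustration only. `found` is None when the
    package is not installed at all.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Treat local builds such as "2.1.2+cu121" as matching "2.1.2".
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package is installed at the expected version. Whether other versions also work with Parrot is not stated in the README, so a mismatch reported here is a prompt to check, not proof of incompatibility.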