---
license: apache-2.0
datasets:
- neulab/PangeaInstruct
language:
- am
- ar
- bg
- bn
- cs
- de
- el
- en
- es
- fa
- fr
- ga
- hi
- id
- ig
- it
- iw
- ja
- jv
- ko
- nl
- mn
- ms
- no
- pl
- pt
- ro
- ru
- si
- su
- sw
- ta
- te
- th
- tr
- uk
- ur
- vi
- zh
base_model:
- Qwen/Qwen2-7B-Instruct
---
# Pangea-7B Model Card

[Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages](https://neulab.github.io/Pangea/)

🇪🇹 🇸🇦 🇧🇬 🇧🇩 🇨🇿 🇩🇪 🇬🇷 🇬🇧 🇺🇸 🇪🇸 🇮🇷 🇫🇷 🇮🇪 🇮🇳 🇮🇩 🇳🇬 🇮🇹 🇮🇱 🇯🇵 🇮🇩 🇰🇷 🇳🇱 🇲🇳 🇲🇾 🇳🇴 🇵🇱 🇵🇹 🇧🇷 🇷🇴 🇷🇺 🇱🇰 🇮🇩 🇰🇪 🇹🇿 🇱🇰 🇮🇳 🇮🇳 🇹🇭 🇹🇷 🇺🇦 🇵🇰 🇮🇳 🇻🇳 🇨🇳 🇹🇼

[๐Ÿ  Homepage](https://neulab.github.io/Pangea/) | [๐Ÿค– Pangea-7B](https://huggingface.co/neulab/Pangea-7B) | [๐Ÿ“Š PangeaIns](https://huggingface.co/datasets/neulab/PangeaInstruct) | [๐Ÿงช PangeaBench](https://huggingface.co/collections/neulab/pangea-6713c3b0d78a453906eb2ed8) | [๐Ÿ’ป Github](https://github.com/neulab/Pangea/tree/main) | [๐Ÿ“„ Arxiv](https://arxiv.org/abs/2410.16153) | [๐Ÿ“• PDF](https://arxiv.org/pdf/2410.16153)


<img src="https://cdn-uploads.huggingface.co/production/uploads/6230d750d93e84e233882dbc/ZjVTKnIsyshWpo-PWg9gM.png" alt="Pangea" style="width:300px;">


## Model details

 - **Model:** Pangea is a fully open multilingual, multimodal, and multicultural LLM.
 - **Date:** Pangea-7B was trained in 2024.
 - **Training Dataset:** [PangeaIns](https://huggingface.co/datasets/neulab/PangeaInstruct), a 6M-sample multilingual multimodal instruction dataset (see the loading sketch below).
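
The training data can be inspected directly with the `datasets` library. A minimal sketch follows; the `train` split name and streaming access are assumptions, not part of this card:

```python
from datasets import load_dataset

# Stream PangeaIns rather than downloading all 6M samples up front;
# the "train" split name is an assumption.
pangea_ins = load_dataset("neulab/PangeaInstruct", split="train", streaming=True)
print(next(iter(pangea_ins)))  # inspect one instruction-tuning example
```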

## Uses

### Direct Use
```python
from transformers import AutoProcessor, AutoModelForCausalLM

# Download the preprocessing pipeline and model weights from the Hub.
processor = AutoProcessor.from_pretrained("neulab/Pangea-7B")
model = AutoModelForCausalLM.from_pretrained("neulab/Pangea-7B")
```
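
Once loaded, a minimal text-only generation sketch looks like the following. It assumes the processor wraps a Qwen2-style tokenizer that ships a chat template (typical for Qwen2-Instruct-based models), and the prompt is purely illustrative:

```python
import torch

# Hypothetical prompt; Pangea accepts instructions in any of its 39 languages.
messages = [{"role": "user", "content": "What is a multilingual multimodal LLM?"}]

# Assumption: the underlying tokenizer provides a chat template.
input_ids = processor.tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(processor.tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
))
```

For image inputs, see the [Github repository](https://github.com/neulab/Pangea/tree/main) for the full multimodal pipeline.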

## Citing the Model

**BibTeX Citation:**

```bibtex
@article{yue2024pangeafullyopenmultilingual,
  title={Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages},
  author={Xiang Yue and Yueqi Song and Akari Asai and Seungone Kim and Jean de Dieu Nyandwi and Simran Khanuja and Anjali Kantharuban and Lintang Sutawika and Sathyanarayanan Ramamoorthy and Graham Neubig},
  year={2024},
  journal={arXiv preprint arXiv:2410.16153},
  url={https://arxiv.org/abs/2410.16153}
}
```