---
license: apache-2.0
datasets:
- neulab/PangeaInstruct
language:
- am
- ar
- bg
- bn
- cs
- de
- el
- en
- es
- fa
- fr
- ga
- hi
- id
- ig
- it
- iw
- ja
- jv
- ko
- nl
- mn
- ms
- no
- pl
- pt
- ro
- ru
- si
- su
- sw
- ta
- te
- th
- tr
- uk
- ur
- vi
- zh
base_model:
- Qwen/Qwen2-7B-Instruct
---
# Pangea-7B Model Card
[Homepage](https://neulab.github.io/Pangea/) | [Pangea-7B](https://huggingface.co/neulab/Pangea-7B) | [PangeaIns](https://huggingface.co/datasets/neulab/PangeaInstruct) | [PangeaBench](https://huggingface.co/collections/neulab/pangea-6713c3b0d78a453906eb2ed8) | [Github](https://github.com/neulab/Pangea/tree/main) | [Arxiv](https://arxiv.org/abs/2410.16153) | [PDF](https://arxiv.org/pdf/2410.16153)
## Model details
- **Model:** Pangea is a fully open-source multilingual, multimodal, multicultural LLM covering 39 languages.
- **Date:** Pangea-7B was trained in 2024.
- **Training Dataset:** [PangeaIns](https://huggingface.co/datasets/neulab/PangeaInstruct), a 6M-sample multilingual instruction dataset.
## Uses
### Direct Use
A minimal loading example (half-precision via `torch_dtype` is optional, but recommended to fit the 7B weights on a single GPU):
```python
import torch
from transformers import AutoProcessor, AutoModelForCausalLM

# Load the processor (tokenizer + image preprocessor) and the model weights.
processor = AutoProcessor.from_pretrained("neulab/Pangea-7B")
model = AutoModelForCausalLM.from_pretrained(
    "neulab/Pangea-7B", torch_dtype=torch.float16
)
```
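For multimodal inference, the sketch below assumes the processor follows the standard LLaVA-style `transformers` interface (an `<image>` placeholder interleaved with text). The image URL, `<image>` token, and prompt layout here are illustrative assumptions; consult the [Github](https://github.com/neulab/Pangea/tree/main) repository for the exact chat template the model was trained with.
```python
import requests
from PIL import Image

# Minimal generation sketch, assuming a LLaVA-style processor interface.
# The image URL is a placeholder, and the "<image>" token / prompt layout
# are assumptions; the model's actual chat template may differ.
image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
prompt = "<image>\nDescribe this picture in Japanese."

# Preprocess image + text together, then generate and decode the reply.
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```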
## Citing the Model
**BibTeX Citation:**
```
@article{yue2024pangeafullyopenmultilingual,
title={Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages},
author={Xiang Yue and Yueqi Song and Akari Asai and Seungone Kim and Jean de Dieu Nyandwi and Simran Khanuja and Anjali Kantharuban and Lintang Sutawika and Sathyanarayanan Ramamoorthy and Graham Neubig},
year={2024},
journal={arXiv preprint arXiv:2410.16153},
url={https://arxiv.org/abs/2410.16153}
}
```