
Oumayma Essarhi

oumayma03

AI & ML interests

None yet

Recent Activity

Organizations

Arabic Machine Learning, Mixed Arabic Datasets, Hugging Face Discord Community

oumayma03's activity

upvoted an article 3 days ago
Article

Darija Chatbot Arena: Making LLMs Compete in the Moroccan Dialect

By atlasia and 2 others • 8
reacted to yuexiang96's post with 🚀 4 months ago
๐ŸŒ Iโ€™ve always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries!

🚀 Introducing Pangea: A Fully Open Multilingual Multimodal LLM supporting 39 languages! 🌍✨

https://neulab.github.io/Pangea/
https://arxiv.org/pdf/2410.16153

The Pangea family includes three major components:
🔥 Pangea-7B: A state-of-the-art multilingual multimodal LLM capable of 39 languages! Not only does it excel in multilingual scenarios, but it also matches or surpasses English-centric models like Llama 3.2, Molmo, and LlavaOneVision in English performance.

๐Ÿ“ PangeaIns: A 6M multilingual multimodal instruction tuning dataset across 39 languages. ๐Ÿ—‚๏ธ With 40% English instructions and 60% multilingual instructions, it spans various domains, including 1M culturally-relevant images sourced from LAION-Multi. ๐ŸŽจ

๐Ÿ† PangeaBench: A comprehensive evaluation benchmark featuring 14 datasets in 47 languages. Evaluation can be tricky, so we carefully curated existing benchmarks and introduced two new datasets: xChatBench (human-annotated wild queries with fine-grained evaluation criteria) and xMMMU (a meticulously machine-translated version of MMMU).

Check out more details: https://x.com/xiangyue96/status/1848753709787795679
liked a Space over 1 year ago