Manel ALOUI

Manel-Hik

AI & ML interests

NLP recommender system, machine learning

Recent Activity

Organizations

🤗 Course Team, AI Law Assistant, LangChainDatasets, FreedomAI, fastai X Hugging Face Group 2022, Arabic Machine Learning, Open Arabic LLM Leaderboard, Data Is Better Together Contributor

Manel-Hik's activity

upvoted an article 14 days ago
Article

The Open Arabic LLM Leaderboard 2

• 26
published an article 15 days ago
Article

The Open Arabic LLM Leaderboard 2

• 26
reacted to joylarkin's post with 🚀 5 months ago
Post
2632
💬 Chat as a way to query SQL! The Airtrain AI team is happy to share a new Hugging Face Space that lets you interact with Hugging Face Hub datasets using a natural language chatbot. 🤗

Start Exploring 👉 airtrain-ai/hf-dataset-chat-to-sql

This Space is forked from davidberenstein1957/text-to-sql-hub-datasets by @davidberenstein1957 and features chat capability with improved table naming. The tool works with Hugging Face's recently released in-browser DuckDB-based SQL query engine for datasets.



reacted to Salama1429's post with 👍 6 months ago
Post
1476
📚 Introducing the 101 Billion Arabic Words Dataset

🌐 Exciting Milestone in Arabic Language Technology! #NLP #ArabicLLM #LanguageModels

🚀 Why It Matters:
1. 🌟 Large Language Models (LLMs) have brought transformative changes, primarily in English. It's time for Arabic to shine!
2. 🎯 This project addresses the critical challenge of bias in Arabic LLMs due to reliance on translated datasets.

🔍 Approach:
1. 💪 Undertook a massive data mining initiative focusing exclusively on Arabic from Common Crawl WET files.
2. 🧹 Employed state-of-the-art cleaning and deduplication processes to maintain data quality and uniqueness.

📈 Impact:
1. 🏆 Created the largest Arabic dataset to date with 101 billion words.
2. 📝 Enables the development of Arabic LLMs that are linguistically and culturally accurate.
3. 🌍 Sets a global benchmark for future Arabic language research.


🔗 Paper: https://lnkd.in/dGAiaygn
🔗 Dataset: https://lnkd.in/dGTMe5QV

- 🔄 Share your thoughts and let's drive the future of Arabic NLP together!

#DataScience #MachineLearning #ArtificialIntelligence #Innovation #ArabicData
New activity in silma-ai/silma-ar-custom-eval 6 months ago

Technical Report

#2 opened 6 months ago by
Manel-Hik
reacted to alielfilali01's post with 🤗 9 months ago
Post
1988
I'm officially considered #gpu_poor 💀
But I'm #data_rich 😎
upvoted an article 10 months ago
Article

Introducing the Open Arabic LLM Leaderboard

• 80