Mohamed Salama PRO

Salama1429

AI & ML interests

NLP

Recent Activity

liked a dataset about 1 month ago
UBC-NLP/Casablanca
liked a model about 1 month ago
minishlab/M2V_multilingual_output

Organizations

Social Post Explorers, Hugging Face Discord Community, AI Starter Pack

Salama1429's activity

New activity in Salama1429/tarteel-ai-everyayah-Quran about 2 months ago

Test missing

1
#2 opened 10 months ago by HadiSDev
New activity in inceptionai/jais-family-2p7b-chat 5 months ago

Update README.md

#1 opened 5 months ago by Salama1429
reacted to their post with 👀🤝👍🧠🤗 7 months ago
Post
2474
📺 Introducing the YouTube-Commons Dataset 📺

🌐 Overview: The YouTube Commons Dataset is a comprehensive collection of 30 billion words from 15,112,121 original and automatically translated transcripts, drawn from 2,063,066 videos on YouTube.

🔗 License: All videos are shared under the CC-BY license, with the majority (71%) in English.

🤖 Applications: This dataset is well suited to training speech-to-text (ASR) models and translation models.

📊 Utilization: The text can be used for model training and may be republished for reproducibility purposes.

🤝 Collaboration: This dataset is the result of a collaboration between the state start-up LANGU:IA, the French Ministry of Culture, and DINUM. It will be expanded in the coming months.

🔗 Explore the dataset here: https://lnkd.in/d_paWKFE

#YouTubeCommons #AIResearch #MachineLearning #OpenData #ArtificialIntelligence #NLP #Dataset #TechCollaboration #Innovation #DigitalTransformation