AI & ML interests

State-of-the-art Machine Learning for real-world robotics

Recent Activity

mishig  updated a Space 2 days ago
lerobot/visualize_dataset
aliberts  updated a dataset about 1 month ago
lerobot/utokyo_xarm_pick_and_place
aliberts  updated a dataset about 1 month ago
lerobot/utokyo_xarm_bimanual

lerobot's activity

fdaudens 
posted an update 5 days ago
🔍 From instruction-following to creative storytelling, dive into 2024's most impactful AI datasets! These gems are shaping everything from scientific research to video understanding.

Check it out: huggingface/open-source-ai-year-in-review-2024
fdaudens 
posted an update 7 days ago
🤝 Want to share your AI models while protecting your work? Licenses are key!

Fascinating to see that nearly 60% of models on the Hub use Apache & MIT licenses.

Explore the viz here: huggingface/open-source-ai-year-in-review-2024
fdaudens 
posted an update 8 days ago
Did a fun experiment: What are the main themes emerging from the 100+ Nieman Journalism Lab predictions for 2025?

I used natural language processing to cluster and map them — really helps spot patterns that weren't obvious when reading predictions one by one. So what will shape journalism next year? A lot of AI and US politics (surprise!), but there's also this horizontal axis that spans from industry strategies to deep reflections on how to talk to the public.

Click any dot to explore the original prediction. What themes surprise/interest you the most?

👉 fdaudens/nieman_lab_2025_predictions_visualization

P.S.: I discovered that Nieman Lab's content is under a Creative Commons license!
fdaudens 
posted an update 10 days ago
fdaudens 
posted an update 13 days ago
thomwolf 
posted an update 16 days ago
We are proud to announce HuggingFaceFW/fineweb-2: A sparkling update to HuggingFaceFW/fineweb with 1000s of 🗣️ languages.

We applied the same data-driven approach that led to SOTA English performance in 🍷 FineWeb to thousands of languages.

🥂 FineWeb2 has 8TB of compressed text data and outperforms other multilingual datasets in our experiments.

The dataset is released under the permissive 📜 ODC-By 1.0 license, and the 💻 code to reproduce it and our evaluations is public.

We will very soon announce a big community project, and are working on a 📝 blogpost walking you through the entire dataset creation process. Stay tuned!

In the meantime, come ask us questions in our chat space: HuggingFaceFW/discussion

H/t @guipenedo @hynky @lvwerra as well as @vsabolcec Bettina Messmer @negar-foroutan and @mjaggi
fdaudens 
posted an update 18 days ago
thomwolf 
posted an update 19 days ago
fdaudens 
posted an update 20 days ago
🎯 New day, new viz!

This teaser barely captures the heat between Meta 🇺🇸, Stability 🇬🇧 & Black Forest Labs 🇩🇪 racing for HF Hub likes. Want to see the full Fast & Furious AI showdown? Check the link below! 🏎️💨

huggingface/open-source-ai-year-in-review-2024
thomwolf 
posted an update 21 days ago
fdaudens 
posted an update 21 days ago
📈👀 Just dropped: a visualization mapping Hugging Face's most liked & downloaded models from 2022 to now. Small models are clearly on the rise - a fascinating shift in both like and download patterns.

Check it out: huggingface/open-source-ai-year-in-review-2024
fdaudens 
posted an update 22 days ago
Keeping up with open-source AI in 2024 = overwhelming.

Here's help: We're launching our Year in Review on what actually matters, starting today!

Fresh content dropping daily until year end. Come along for the ride - first piece out now with @clem's predictions for 2025.

Think of it as your end-of-year AI chocolate calendar.

Kudos to @BrigitteTousi @clefourrier @Wauplin @thomwolf for making it happen. We teamed up with aiworld.eu for awesome visualizations to make this digestible; their team is a charm to work with.

Check it out: huggingface/open-source-ai-year-in-review-2024
fdaudens 
posted an update 26 days ago
fdaudens 
posted an update 27 days ago
The rapid progress in small audio models is mind-blowing! 🤯 Just tested OuteTTS v0.2 - cloned my voice from a 10s clip with impressive accuracy and natural prosody.

At 500M parameters, it's efficient enough to run on basic hardware but powerful enough for professional use.

This could transform how we produce audio content for news - think instant translated interviews keeping original voices, or scaled audio article production!

Demo and Model on the Hub: OuteAI/OuteTTS-0.2-500M h/t @reach-vb
fdaudens 
posted an update 30 days ago
🤖 93% of Gen Z workers use AI tools weekly, but nearly half of all workers aren't comfortable admitting it. The tech adoption gap isn't about usage—it's about openness. Why are we still treating AI tools like a workplace secret? 🤔

See this article: https://www.axios.com/2024/11/25/gen-z-ai-work-survey
thomwolf 
posted an update about 1 month ago
fdaudens 
posted an update about 1 month ago
🦋 Hug the butterfly! You can now add your Bluesky handle to your Hugging Face profile! ✨
BrigitteTousi 
posted an update about 1 month ago
fdaudens 
posted an update about 1 month ago
🚀 DeepSeek just dropped DeepSeek-R1-Lite-Preview with “reasoning” capacity.

- Matches OpenAI o1-preview on AIME & MATH benchmarks.
- Transparent process output
- Open-source model to be released

Try it out: https://chat.deepseek.com/