Andrea Soria

asoria

AI & ML interests

Maintainer of 🤗Datasets: Data processing

asoria's activity

upvoted an article 8 days ago
Introducing BERTopic Integration with Hugging Face Hub

6
upvoted 2 articles 9 days ago
Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

161
upvoted an article 11 days ago
Introducing the SQL Console on Datasets

17
upvoted an article 18 days ago
Fine-Tuning Gemma Models in Hugging Face

22
upvoted an article about 1 month ago
The 5 Most Under-Rated Tools on Hugging Face

81
upvoted an article about 2 months ago
SmolLM - blazingly fast and remarkably powerful

244
upvoted 3 articles 2 months ago
Docmatix - a huge dataset for Document Visual Question Answering

65
Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

61
Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality

30
upvoted 2 articles 3 months ago
Experimenting with Automatic PII Detection on the Hub using Presidio

23
Announcing New Dataset Search Features

22
upvoted 2 articles 4 months ago
How to directly access 150k+ Hugging Face Datasets with DuckDB and query using GPT-4o

By chilijung
10
Synthetic dataset generation techniques: generating custom sentence similarity data

14
upvoted 3 articles 5 months ago
Synthetic data: save money, time and carbon with open source

46
🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets

69
Text2SQL using Hugging Face Dataset Viewer API and Motherduck DuckDB-NSQL-7B

23
upvoted 2 articles 6 months ago
It's raining diffusion personalization techniques☔️🎭🖼️

By linoyts
18
DuckDB: run SQL queries on 50,000+ datasets on the Hugging Face Hub

4