Big Science Social Impact Evaluation for Bias and Stereotypes

community

AI & ML interests

datasets, social impact, bias, evaluation

Recent Activity

LanguageShades's activity

fdaudens
posted an update about 7 hours ago
Did we just drop personalized AI evaluation?! This tool auto-generates custom benchmarks on your docs to test which models are the best.

Most benchmarks test general capabilities, but what matters is how models handle your data and tasks. YourBench helps answer critical questions like:
- Do you really need a hundreds-of-billions-parameter model sledgehammer to crack a nut?
- Could a smaller, fine-tuned model work better?
- How well do different models understand your domain?

Some cool features:
📚 Generates custom benchmarks from your own documents (PDFs, Word, HTML)
🎯 Tests models on real tasks, not just general capabilities
🔄 Supports multiple models for different pipeline stages
🧠 Generates both single-hop and multi-hop questions
🔍 Evaluate top models and deploy leaderboards instantly
💰 Full cost analysis to optimize for your budget
🛠️ Fully configurable via a single YAML file

26 SOTA models tested for question generation. Interesting finding: Qwen2.5 32B leads in question diversity, while smaller Qwen models and Gemini 2.0 Flash offer great value for cost.

You can also run it locally on any models you want.

I'm impressed. Try it out: yourbench/demo
fdaudens
posted an update 2 days ago
🔥 DeepSeek vibe coding with DeepSite is going viral with awesome projects!

From games to stunning visualizations, 7 wild examples:

📺 AI TV with custom channels and animations https://x.com/_akhaliq/status/1905747381951545647

🚀 Earth to Moon spacecraft journey visualization
Watch this incredible Three.js space simulation with zero external assets:
https://x.com/_akhaliq/status/1905836902533451999

💣 Minesweeper in 2.5 minutes! Built & deployed instantly on DeepSite. Zero setup needed:
https://x.com/cholf5/status/1906031928937218334

🎮 Asked for Game of Life, got a masterpiece. Simple prompt, complex features. See it in action: https://x.com/pbeyssac/status/1906304454824992844

💫 One-shot anime website with perfect UI. DeepSite turned a simple request into a fully-functional anime site: https://x.com/risphereeditor/status/1905961725028913264

📊 10-minute World Indicators Dashboard. Just described what I wanted and got a full interactive dashboard! https://x.com/i/status/1906345214089785634

✨ Ready to build without coding? Imagine it. Build it. Share it! enzostvs/deepsite
fdaudens
posted an update 4 days ago
Want to vibecode with DeepSeek? Just spent 10 minutes with this space and created a full world indicators dashboard - literally just by describing what I wanted!

Anyone can now prototype and deploy projects instantly.

Try out the app: enzostvs/deepsite

My dashboard: fdaudens/world-indicators
fdaudens
posted an update 7 days ago
Want to ramp up your AI skills and start breaking bigger stories? With the Journalists on Hugging Face community, we're launching our first learn-together course!

We'll build AI classifiers that process months of data in minutes. How?

- Work through an interactive version of an excellent course developed by Ben Welsh and Derek Willis
- Share findings and get help in our dedicated community channel
- Build working classifiers you can use in your reporting today

No coding background needed - if you can write a ChatGPT or Claude prompt, you can do this. Journalists are already using these techniques to break stories, from uncovering hidden real estate deals to tracking unusual campaign spending.

Join us - it might give you your next big story!

Thanks to Ben and Derek for letting me adapt their excellent course into this interactive version!

- Check out the course: JournalistsonHF/first-llm-classifier

- Join our Slack community to learn together: https://docs.google.com/forms/d/e/1FAIpQLSfyA7G6Y9q-5hDBSnGc3CFtg9H8fjqKCCuieptXuTqRudGNjQ/viewform
giadap
posted an update 8 days ago
We've all become experts at clicking "I agree" without a second thought. In my latest blog post, I explore why these traditional consent models are increasingly problematic in the age of generative AI.

I found three fundamental challenges:
- Scope problem: how can you know what you're agreeing to when AI could use your data in different ways?
- Temporality problem: once an AI system learns from your data, good luck trying to make it "unlearn" it.
- Autonomy trap: the data you share today could create systems that pigeonhole you tomorrow.

Individual users shouldn't bear all the responsibility, while big tech holds all the cards. We need better approaches to level the playing field, from collective advocacy and stronger technological safeguards to establishing "data fiduciaries" with a legal duty to protect our digital interests.

Available here: https://huggingface.co/blog/giadap/beyond-consent
fdaudens
posted an update 13 days ago
🎥 Just tested Stability AI's Stable Virtual Camera - it turns a single photo into dynamic video with AI-powered camera movements! From static meeting room to cinematic sweeps. 🚀

Try it out: stabilityai/stable-virtual-camera
fdaudens
posted an update 14 days ago
🔊 Meet Orpheus: A breakthrough open-source TTS model that matches human-level speech with empathy & emotion.
- Available in 4 sizes (150M-3B parameters)
- Delivers ultra-fast streaming
- Zero-shot voice cloning
- Apache 2.0 license

canopylabs/orpheus-tts-67d9ea3f6c05a941c06ad9d2
fdaudens
posted an update 16 days ago
Want to build useful newsroom tools with AI? We're launching a Hugging Face x Journalism Slack channel where journalists turn AI concepts into real newsroom solutions.

Inside the community:
✅ Build open-source AI tools for journalism
✅ Get direct help from the community
✅ Stay updated on new models and datasets
✅ Learn from other journalists' experiments and builds

The goal? Go from "I read about AI" to "I built an AI tool that supercharged my newsroom." No more learning in isolation.

Join us! https://join.slack.com/t/journalistson-tnd8294/shared_invite/zt-30vsmhk4w-dZpeMOoxdhCvfNsqtspPUQ (Please make sure to use a clear identity - no teddybear85, for example 😉)

(If you know people who might be interested, tag them below! The more minds we bring in, the better the tools we build.)

fdaudens
posted an update 20 days ago
🤯 Gemma 3's image analysis blew me away!

Tested 2 ways to extract airplane registration numbers from photos with the 12B model:

1️⃣ Gradio app w/API link (underrated feature IMO) + ZeroGPU infra on Hugging Face in Google Colab. Fast & free.

2️⃣ LMStudio + local processing (100% private). Running this powerhouse on a MacBook w/16GB RAM is wild! 🚀

Colab: https://colab.research.google.com/drive/1YmmaP0IDEu98CLDppAAK9kbQZ7lFnLZ1?usp=sharing
fdaudens
posted an update 21 days ago
Ever wanted 45 min with one of AI's most fascinating minds? Was with @thomwolf at HumanX Vegas. Sharing my notes from his Q&A with the press, which completely changed how I think about AI's future:

1️⃣ The next wave of successful AI companies won't be defined by who has the best model but by who builds the most useful real-world solutions. "We all have engines in our cars, but that's rarely the only reason we buy one. We expect it to work well, and that's enough. LLMs will be the same."

2️⃣ Big players are pivoting: "Closed-source companies, OpenAI being the first, have largely shifted from LLM announcements to product announcements."

3️⃣ Open source is changing everything: "DeepSeek was open source AI's ChatGPT moment. Basically, everyone outside the bubble realized you can get a model for free, and it's just as good as the paid ones."

4️⃣ Product innovation is being democratized: Take Manus, for example. They built a product on top of Anthropic's models that's "actually better than Anthropic's own product for now, in terms of agents." This proves that anyone can build great products with existing models.

We're entering a "multi-LLM world," where models are becoming commoditized, and all the tools to build are readily available. Just look at the flurry of daily new releases on Hugging Face.

Thom's comparison to the internet era is spot-on: "In the beginning you made a lot of money by making websites... but nowadays the huge internet companies are not the companies that built websites. Like Airbnb, Uber, Facebook, they just use the internet as a medium to make something for real life use cases."

I'd love to hear your thoughts on this shift!
fdaudens
posted an update 22 days ago
🔥 The Open R1 team just dropped OlympicCoder and it's wild:

- 7B model outperforms Claude 3.7 Sonnet on IOI benchmark (yes, 7B!!)
- 32B crushes all open-weight models tested, even those 100x larger 🤯

Open-sourcing the future of code reasoning! 🚀

Check it out: https://huggingface.co/blog/open-r1/update-3
fdaudens
posted an update 24 days ago
Honored to be named among the 12 pioneers and power players in the news industry in the 2025 Tech Trends Report from Future Today Strategy Group.

Incredible group to be part of - each person is doing groundbreaking work at the intersection of AI and journalism. Worth following them all: they're consistently sharing practical insights on building the future of news.

Take the time to read this report, it's packed with insights as always. The news & information section's #1 insight hits hard: "The most substantive economic impact of AI to date has been licensing payouts for a handful of big publishers. The competition will start shifting in the year ahead to separate AI 'haves' that have positioned themselves to grow from the 'have-nots.'"

This AI-driven divide is something I've been really concerned about. Now is the time to build more than ever!

👉 Full report here: https://ftsg.com/wp-content/uploads/2025/03/FTSG_2025_TR_FINAL_LINKED.pdf
fdaudens
posted an update 28 days ago
AI will bring us "a country of yes-men on servers" instead of one of "Einsteins sitting in a data center" if we continue on current trends.

Must-read by @thomwolf deflating overblown AI promises and explaining what real scientific breakthroughs require.

https://thomwolf.io/blog/scientific-ai.html
fdaudens
posted an update about 1 month ago
What if AI becomes as ubiquitous as the internet, but runs locally and transparently on our devices?

Fascinating TED talk by @thomwolf on open source AI and its future impact.

Imagine this for AI: instead of black box models running in distant data centers, we get transparent AI that runs locally on our phones and laptops, often without needing internet access. If the original team moves on? No problem - resilience is one of the beauties of open source. Anyone (companies, collectives, or individuals) can adapt and fix these models.

This is a compelling vision of AI's future that solves many of today's concerns around AI transparency and centralized control.

Watch the full talk here: https://www.ted.com/talks/thomas_wolf_what_if_ai_just_works