Brigitte Tousignant

BrigitteTousi

AI & ML interests

None yet

Recent Activity

liked a Space 3 days ago
victor/deepsite-gallery

Organizations

Hugging Face, Society & Ethics, HuggingFaceM4, Open-Source AI Meetup, BigCode, Hugging Face OSS Metrics, IBM-NASA Prithvi Models Family, Hugging Face Smol Models Research, Wikimedia Movement, LeRobot, Journalists on Hugging Face, Women on Hugging Face, Social Post Explorers, Dev Mode Explorers, Hugging Face Science, Coordination Nationale pour l'IA, open/ acc, Bluesky Community, Sandbox, Open R1

BrigitteTousi's activity

reacted to AdinaY's post with 👀 5 days ago
AutoGLM Rumination (沉思) 💫 FREE AI Agent released by ZhipuAI

✨ Think & act simultaneously
✨ Based on a fully self-developed stack: GLM-4 for general tasks, GLM-Z1 for reasoning, and GLM-Z1-Rumination for rumination
✨ Will openly share these models on April 14 🤯

Preview version 👉 https://autoglm-research.zhipuai.cn/?channel=autoglm_android
reacted to fdaudens's post with ❤️ 5 days ago
🔥 DeepSeek vibe coding with DeepSite is going viral with awesome projects!

From games to stunning visualizations, 7 wild examples:

📺 AI TV with custom channels and animations https://x.com/_akhaliq/status/1905747381951545647

🚀 Earth-to-Moon spacecraft journey visualization
Watch this incredible Three.js space simulation with zero external assets:
https://x.com/_akhaliq/status/1905836902533451999

💣 Minesweeper in 2.5 minutes! Built & deployed instantly on DeepSite. Zero setup needed:
https://x.com/cholf5/status/1906031928937218334

🎮 Asked for Game of Life, got a masterpiece. Simple prompt, complex features. See it in action: https://x.com/pbeyssac/status/1906304454824992844

💫 One-shot anime website with perfect UI. DeepSite turned a simple request into a fully functional anime site: https://x.com/risphereeditor/status/1905961725028913264

📊 10-minute World Indicators Dashboard. Just described what I wanted and got a full interactive dashboard! https://x.com/i/status/1906345214089785634

✨ Ready to build without coding? Imagine it. Build it. Share it! enzostvs/deepsite
reacted to fdaudens's post with 🔥 9 days ago
Want to ramp up your AI skills and start breaking bigger stories? With the Journalists on Hugging Face community, we're launching our first learn-together course!

We'll build AI classifiers that process months of data in minutes. How?

- Work through an interactive version of an excellent course developed by Ben Welsh and Derek Willis
- Share findings and get help in our dedicated community channel
- Build working classifiers you can use in your reporting today

No coding background needed - if you can write a ChatGPT or Claude prompt, you can do this. Journalists are already using these techniques to break stories, from uncovering hidden real estate deals to tracking unusual campaign spending.
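
To give a flavor of the technique, here's my own minimal sketch of prompt-based classification (this is not the course's code; the model ID and prompt wording are just examples):

from huggingface_hub import InferenceClient

# Minimal prompt-based classifier sketch; model choice and prompt are assumptions.
client = InferenceClient("meta-llama/Llama-3.1-8B-Instruct")

def classify(text, labels):
    prompt = (
        f"Classify the following text into exactly one of these labels: {labels}. "
        f"Reply with the label only.\n\nText: {text}"
    )
    response = client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=10,
    )
    return response.choices[0].message.content.strip()

print(classify("Council approves land sale to mayor's cousin", ["newsworthy", "routine"]))

Loop that over a spreadsheet of records and you have the "months of data in minutes" workflow.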

Join us - it might give you your next big story!

Thanks to Ben and Derek for letting me adapt their excellent course into this interactive version!

- Check out the course: JournalistsonHF/first-llm-classifier

- Join our Slack community to learn together: https://docs.google.com/forms/d/e/1FAIpQLSfyA7G6Y9q-5hDBSnGc3CFtg9H8fjqKCCuieptXuTqRudGNjQ/viewform
reacted to Kseniase's post with 🔥 19 days ago
15 types of attention mechanisms

Attention mechanisms allow models to dynamically focus on specific parts of their input when performing tasks. In our recent article, we discussed Multi-Head Latent Attention (MLA) in detail, and now it's time to summarize the other existing types of attention.

Here is a list of 15 types of attention mechanisms used in AI models:

1. Soft attention (Deterministic attention) -> Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473)
Assigns a continuous weight distribution over all parts of the input. It produces a weighted sum of the input using attention weights that sum to 1.

2. Hard attention (Stochastic attention) -> Effective Approaches to Attention-based Neural Machine Translation (1508.04025)
Makes a discrete selection of some part of the input to focus on at each step, rather than attending to everything.

3. Self-attention -> Attention Is All You Need (1706.03762)
Each element in the sequence "looks" at the other elements and "decides" how much to borrow from each of them for its new representation (see the minimal sketch after this list).

4. Cross-Attention (Encoder-Decoder attention) -> Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation (2104.08771)
The queries come from one sequence and the keys/values come from another sequence. It allows a model to combine information from two different sources.

5. Multi-Head Attention (MHA) -> Attention Is All You Need (1706.03762)
Multiple attention "heads" are run in parallel. The model computes several attention distributions (heads), each with its own set of learned projections of queries, keys, and values.

6. Multi-Head Latent Attention (MLA) -> DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (2405.04434)
Extends MHA by incorporating a latent space where attention heads can dynamically learn different latent factors or representations.

7. Memory-Based attention -> End-To-End Memory Networks (1503.08895)
Involves an external memory and uses attention to read from and write to this memory.
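
To make self-attention (type 3) concrete, here's a minimal NumPy sketch of one head of scaled dot-product attention; the random projection matrices stand in for learned weights:

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Queries, keys, and values all come from the SAME sequence X
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # pairwise relevance, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True) # softmax: each row sums to 1
    return weights @ V                             # weighted sum of values

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                       # 5 tokens, hidden dim 16
Wq, Wk, Wv = [rng.normal(size=(16, 16)) for _ in range(3)]
print(self_attention(X, Wq, Wk, Wv).shape)         # (5, 16)

Note how each softmax row sums to 1, which is exactly the soft-attention property from type 1; hard attention would instead pick a single position per row.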

See other types in the comments 👇
reacted to ginipick's post with 🔥 23 days ago
๐ŸŒ GraphMind: Phi-3 Instruct Graph Explorer

✨ Extract and visualize knowledge graphs from any text in multiple languages!

GraphMind is a powerful tool that leverages the capabilities of Phi-3 to transform unstructured text into structured knowledge graphs, helping you understand complex relationships within any content.

ginigen/Graph-Mind

🚀 Key Features

Multi-language Support 🌐: Process text in English, Korean, and many other languages
Instant Visualization 🧩: See extracted entities and relationships in an interactive graph
Entity Recognition 🏷️: Automatically identifies and categorizes named entities
Optimized Performance ⚡: Uses caching to deliver faster results for common examples
Intuitive Interface 👆: Simple design makes complex graph extraction accessible to everyone

💡 Use Cases

Content Analysis: Extract key entities and relationships from articles or documents
Research Assistance: Quickly visualize connections between concepts in research papers
Educational Tool: Help students understand the structure of complex texts
Multilingual Processing: Extract knowledge from content in various languages

🔧 How It Works

Enter any text in the input field
Select a model from the dropdown
Click "Extract & Visualize"
Explore the interactive knowledge graph and entity recognition results

GraphMind bridges the gap between raw text and structured knowledge, making it easier to identify patterns, extract insights, and understand relationships within any content. Try it now and transform how you interact with textual information!
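
The Space's internals aren't shown in the post, but the extract-then-visualize loop it describes can be sketched roughly like this (the model ID, prompt, and JSON format below are my assumptions, not GraphMind's actual code):

import json
import networkx as nx
from transformers import pipeline

# Hypothetical sketch: ask an instruct model for triples, then build a graph.
generator = pipeline("text-generation", model="microsoft/Phi-3-mini-4k-instruct")

prompt = (
    "Extract knowledge-graph triples from the text below. Reply with JSON only: "
    '[{"subject": "...", "relation": "...", "object": "..."}]\n\n'
    "Text: Marie Curie won the Nobel Prize in Physics in 1903."
)
raw = generator(prompt, max_new_tokens=200, return_full_text=False)[0]["generated_text"]

graph = nx.DiGraph()
for triple in json.loads(raw):  # assumes the model returned well-formed JSON
    graph.add_edge(triple["subject"], triple["object"], relation=triple["relation"])
print(graph.edges(data=True))
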
#NLP #KnowledgeGraph #TextAnalysis #Visualization #Phi3 #MultilingualAI
replied to burtenshaw's post 23 days ago

brb making a PR to include dog emoji reaction

reacted to burtenshaw's post with 🔥 23 days ago
everybody and their dog is fine-tuning Gemma 3 today, so I thought I'd do a longer post on the tips and sharp edges I find. let's go!

1. you have to install everything from main and nightly. this is what I'm working with to get Unsloth and TRL running

git+https://github.com/huggingface/transformers@main
git+https://github.com/huggingface/trl.git@main
bitsandbytes
peft


plus this with --no-deps

git+https://github.com/unslothai/unsloth-zoo.git@nightly
git+https://github.com/unslothai/unsloth.git@nightly


2. Will Brown's code to turn GSM8k into a reasoning dataset is a nice toy experiment https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb

3. with a learning rate of 5e-6, rewards and loss stayed flat for the first 100 or so steps.

4. so far none of my runs have degraded the outputs after 1 epoch. therefore, I'm mainly experimenting with bigger LoRA adapters.

from trl import GRPOConfig

training_args = GRPOConfig(
    learning_rate = 5e-6,
    adam_beta1 = 0.9,
    adam_beta2 = 0.99,
    weight_decay = 0.1,
    warmup_ratio = 0.1,
    lr_scheduler_type = "cosine",
    optim = "adamw_8bit",                # 8-bit AdamW (bitsandbytes) to save memory
    logging_steps = 1,
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 1,
    num_generations = 2,                 # completions sampled per prompt for GRPO
    max_prompt_length = 256,
    max_completion_length = 1024 - 256,  # completion budget = context minus prompt
    num_train_epochs = 1,
    max_steps = 250,
    save_steps = 250,
    max_grad_norm = 0.1,                 # tight gradient clipping
    report_to = "none",
)
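
for context, here's a minimal sketch of how a config like this plugs into TRL's GRPOTrainer; the checkpoint, dataset, and toy reward function are my placeholders, not the post's setup:

from datasets import load_dataset
from trl import GRPOTrainer

def length_reward(completions, **kwargs):
    # Toy reward favoring longer completions; a real run would check answer correctness
    return [len(c) / 1024 for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")  # any dataset with a "prompt" column

trainer = GRPOTrainer(
    model="google/gemma-3-1b-it",   # assumed checkpoint, pick your own
    reward_funcs=length_reward,
    args=training_args,             # the GRPOConfig from above
    train_dataset=dataset,
)
trainer.train()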


5. vision fine-tuning isn't available in TRL's GRPOTrainer, so stick to text datasets. but there's no need to load the model differently in transformers or Unsloth

from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained("google/gemma-3-4b-it")


if you want an introduction to GRPO, check out the reasoning course: it walks you through the algorithm, theory, and implementation in a smooth way.

reasoning-course
reacted to fdaudens's post with 🔥 24 days ago
Ever wanted 45 min with one of AI's most fascinating minds? Was with @thomwolf at HumanX Vegas. Sharing my notes of his Q&A with the press - completely changed how I think about AI's future:

1๏ธโƒฃ The next wave of successful AI companies wonโ€™t be defined by who has the best model but by who builds the most useful real-world solutions. "We all have engines in our cars, but thatโ€™s rarely the only reason we buy one. We expect it to work well, and thatโ€™s enough. LLMs will be the same."

2๏ธโƒฃ Big players are pivoting: "Closed-source companiesโ€”OpenAI being the firstโ€”have largely shifted from LLM announcements to product announcements."

3๏ธโƒฃ Open source is changing everything: "DeepSeek was open source AIโ€™s ChatGPT moment. Basically, everyone outside the bubble realized you can get a model for freeโ€”and itโ€™s just as good as the paid ones."

4๏ธโƒฃ Product innovation is being democratized: Take Manus, for exampleโ€”they built a product on top of Anthropicโ€™s models thatโ€™s "actually better than Anthropicโ€™s own product for now, in terms of agents." This proves that anyone can build great products with existing models.

We're entering a "multi-LLM world," where models are becoming commoditized, and all the tools to build are readily available - just look at the flurry of daily new releases on Hugging Face.

Thom's comparison to the internet era is spot-on: "In the beginning you made a lot of money by making websites... but nowadays the huge internet companies are not the companies that built websites. Like Airbnb, Uber, Facebook, they just use the internet as a medium to make something for real life use cases."

Love to hear your thoughts on this shift!
reacted to thomwolf's post with 🔥🚀 24 days ago
We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️OlympicCoder (open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming, a domain Anthropic has historically been really strong at, and it's getting close to o1-mini/R1 on olympiad-level coding with just 7B parameters!

And the best part is that we're open-sourcing everything about it: the training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets we are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions
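
If you want to poke at the releases, they load like any other Hub dataset; a quick sketch (the split name is an assumption):

from datasets import load_dataset

ds = load_dataset("open-r1/codeforces", split="train")  # assumed split name
print(ds[0])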
reacted to clefourrier's post with 🚀 24 days ago
The Gemma3 family is out! I was reading the tech report, and this section was really interesting to me from a methods/scientific-fairness point of view.

Instead of doing over-hyped comparisons, they clearly state that **results are reported in a setup which is advantageous to their models**.
(Which everybody does, but people usually don't say)

For a tech report, it makes a lot of sense to report model performance when used optimally!
On leaderboards, on the other hand, comparisons will be apples to apples, but in a potentially suboptimal way for a given model family (just as some users interact sub-optimally with models)

It also contains a cool section (6) on training-data memorization rates! Important to see whether your model will output the training data it has seen as such: always an issue for privacy/copyright/... but also very much for evaluation!

Because if your model knows its evals by heart, you're not testing for generalization.
replied to openfree's post 24 days ago
reacted to openfree's post with 🤗❤️👀🚀🔥 24 days ago
Huggingface Space Leaderboard 🚀
Hello Huggingface Community!

VIDraft/Space-Leaderboard

We are excited to introduce the Huggingface Space Leaderboard, a service that lets you view the latest trending Spaces on the Huggingface platform at a glance. This service helps you quickly explore a wide range of creative projects and will spark new inspiration for your own ideas. 🎉

Detailed Feature Overview

1. Real-time Trend Reflection
Automated Aggregation: Analyzes and ranks over 500 popular Spaces on Huggingface in real time.
Accurate Ranking: Combines various metrics such as likes, engagement, and creation time to accurately reflect the latest trends.
Instant Updates: Data is continuously updated, so you always see the most current popular Spaces.
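
The post doesn't publish the exact formula, but a recency-weighted score of the kind described could look something like this sketch (the weighting and half-life are pure assumptions):

import time

def trending_score(likes, created_at, now=None, half_life_hours=48.0):
    # Hypothetical ranking: engagement decays as the Space ages
    now = now if now is not None else time.time()
    age_hours = (now - created_at) / 3600
    decay = 0.5 ** (age_hours / half_life_hours)
    return likes * decay

# A day-old Space with 100 likes vs. a week-old Space with 300 likes
print(trending_score(100, time.time() - 24 * 3600))       # ~70.7
print(trending_score(300, time.time() - 7 * 24 * 3600))   # ~26.5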

2. Intuitive Preview
70% Scaled Preview: Each Space is displayed at 70% scale, providing a neat and clear preview at a glance.
Easy Visual Comparison: View multiple Spaces side by side to easily compare their designs and functionalities.
Error Handling: In case of loading issues, a clear error message with a direct link is provided to help resolve any problems.

3. Creator Statistics
Top 30 Creators Analysis: A chart visualizes the number of Spaces created by the most active creators, giving you a clear view of the community's top contributors. 📊
Data-driven Insights: Analyze the activity trends of each creator to gain fresh insights and inspiration.
Collaboration Opportunities: Use the statistics to easily identify potential collaborators within the community.

Why Choose the Huggingface Space Leaderboard?
🚀 Fast and Reliable: Real-time data updates deliver the latest trends instantly, ensuring you gain insights without any delays.
🔎 Easy Search Functionality: Easily find the Space you're looking for with filters by name, owner, or tags.
💡 Intuitive Design: A clean, user-friendly interface makes it simple for anyone to navigate and explore.
reacted to jasoncorkill's post with 👀 24 days ago
Benchmarking Google's Veo2: How Does It Compare?

Google recently launched Veo2, its latest text-to-video model, through select partners like fal.ai. As part of our ongoing evaluation of state-of-the-art generative video models, we rigorously benchmarked Veo2 against industry leaders.

We generated a large set of Veo2 videos, spending hundreds of dollars in the process, and systematically evaluated them using our Python-based API for human and automated labeling.

The results did not meet expectations. Veo2 struggled with style consistency and temporal coherence, falling behind competitors like Runway, Pika, Tencent, and even Alibaba. While the model shows promise, its alignment and quality are not yet there.

Check out the ranking here: https://www.rapidata.ai/leaderboard/video-models

Rapidata/text-2-video-human-preferences-veo2
reacted to AdinaY's post with 🔥 24 days ago
Spark TTS 🔊 New OPEN TTS model that can generate any voice with just seconds of audio!

Released by the SparkAudio community 🔥

Model 👉 SparkAudio/Spark-TTS-0.5B
Paper 👉 Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens (2503.01710)

✨ Supports English & Chinese
✨ BiCodec Speech Codec: Enables precise voice control by separating semantics & speaker attributes