AI & ML interests

Aligning LLMs to be helpful, honest, harmless, and huggy (H4)

Recent Activity

sergiopaniego 
posted an update about 4 hours ago
view post
Post
28
Nemotron 3 Super by @nvidia is here! NVIDIA's hybrid Mamba2/Transformer models are now natively supported in transformers (no trust_remote_code needed)

Fine-tune them with TRL in just a few lines of code. Notebook + script included to get started right away. goooo!

- Notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_nemotron_3.ipynb
- Script: https://github.com/huggingface/trl/blob/main/examples/scripts/sft_nemotron_3.py
- Collection with all the models: https://huggingface.co/collections/nvidia/nvidia-nemotron-v3
alvarobartt 
posted an update 6 days ago
view post
Post
3170
Learn how to deploy Microsoft Research VibeVoice ASR on Microsoft Azure Foundry with Hugging Face to generate rich audio transcriptions with Who, When, and What! 💥

> 🕒 60-minute single-pass processing, no chunking or stitching
> 👤 Customized hotwords to guide recognition on domain-specific content
> 📝 Rich transcription: joint ASR + diarization + timestamping in one pass
> 🌍 50+ languages with automatic detection and code-switching support
> 🤗 Deployed on Microsoft Foundry via an OpenAI-compatible Chat Completions API

https://huggingface.co/docs/microsoft-azure/foundry/examples/deploy-vibevoice-asr
sergiopaniego 
posted an update 8 days ago
view post
Post
448
did you know you can train agentic models with RL deploying the environments on HF Spaces? 🤗

with TRL + OpenEnv, your training script connects to remote environments hosted as Spaces

want to train faster? → just add more Spaces (TRL handles the parallelization natively)

we used this to train a model to solve the trolley problem in CARLA. 2 HF Spaces running a full driving simulator, each on a T4 GPU

full write-up with code and results → https://huggingface.co/blog/sergiopaniego/bringing-carla-to-openenv-trl
sergiopaniego 
posted an update 9 days ago
sergiopaniego 
posted an update 13 days ago
view post
Post
2321
What happens when you make an LLM drive a car where physics are real and actions can't be undone?

I ported CARLA, the autonomous driving simulator, to OpenEnv and added training support via TRL + Hugging Face Spaces.

The model interacts with the simulator through tool calls (observe, brake, change lane) and learns from a reward signal.

In 50 training steps, Qwen 0.6B learns to swerve and brake to avoid pedestrians in emergency situations.

The project supports text and vision (VLMs can see through a camera sensor), open-world driving with traffic, and multiple driving scenarios.

This builds on the carla-env project by sinatras, which originally placed LLMs inside CARLA for evaluation. We extended it with vision, new scenarios, rubric-based rewards, and made it trainable end-to-end.

Blog: https://huggingface.co/blog/sergiopaniego/bringing-carla-to-openenv-trl/
CARLA env in OpenEnv: https://github.com/meta-pytorch/OpenEnv/tree/main/envs/carla_env
Training script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/carla.py
albertvillanova 
posted an update 13 days ago
view post
Post
1839
🚀 TRL v0.29.0 introduces trl-training: an agent-native training skill.

This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)

We’re excited to see what the community builds on top of this.

If you’re working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗

The future of ML tooling is agent-native.
🔗 https://github.com/huggingface/trl/releases/tag/v0.29.0
qgallouedec 
posted an update 20 days ago
view post
Post
2725
@CohereLabs just released 🌿 Tiny Aya: a fully open-source 3B parameter model that speaks 70+ languages 🌍! But there’s a catch:

Tiny Aya is just a language model. It doesn’t support tool calling, the key capability that turns frontier models into powerful *agents*.
So the real question is:

How hard is it to turn Tiny Aya into an agent?

Turns out… it’s simple, thanks to Hugging Face TRL.
We’re sharing a hands-on example showing how to train Tiny Aya to turn it into a tool-calling agent using TRL, unlocking what could become the first *massively multilingual open agent*.

Small model. Global reach. Agent capabilities.

👉 https://github.com/huggingface/trl/blob/main/examples/notebooks/sft_tool_calling.ipynb
  • 1 reply
·
sergiopaniego 
posted an update 21 days ago
sergiopaniego 
posted an update 26 days ago
albertvillanova 
posted an update 28 days ago
view post
Post
1746
5 years already working in democratizing AI 🤗
Grateful to be part of such an awesome team making it happen every day.
sergiopaniego 
posted an update about 1 month ago
view post
Post
493
if you're looking for a good first issue to get your open-source journey started, you could contribute to this TRL issue by documenting one impactful paper in the docs

we have a broad list to cover!! 🧐

https://github.com/huggingface/trl/issues/4407