
Victor Mustar PRO

victor

AI & ML interests

Building the UX of this website

Recent Activity

Organizations

Hugging Face, Google, Safetensors, Competitions, 21 RNN, Spaces-explorers, Text Generation Inference, Spaces Examples, CVPR Demo Track, Hugging Chat, Webhooks Explorers (BETA), lora concepts library, Huggingface Projects, Scanned Tokens, hf admins, Hugging Face OSS Metrics, Stable Diffusion Dreambooth Concepts Library, Core ML Projects, temp-org, Blog-explorers, Mustarz, Open LLM Leaderboard, Enterprise Explorers, The Collectionists, ZeroGPU Explorers, Hugging Face Tools, TstOrg141, Stable Video benchmark, Social Post Explorers, Dev Mode Explorers, LLHF, SLLHF, Self-serve FTW, Inference Explorers

victor's activity

reacted to burtenshaw's post with 🚀 1 day ago
Post
5202
AGENTS + FINETUNING! This week Hugging Face Learn has a whole pathway on fine-tuning for agentic applications. You can follow these two courses to level up your agent game beyond prompts:

1️⃣ New Supervised Fine-tuning unit in the NLP Course https://huggingface.co/learn/nlp-course/en/chapter11/1
2️⃣ New Finetuning for agents bonus module in the Agents Course https://huggingface.co/learn/agents-course/bonus-unit1/introduction

Fine-tuning squeezes more out of your model for your specific use case than any prompt can.
  • 2 replies
·
reacted to fdaudens's post with ❤️ 1 day ago
reacted to clem's post with 👍 1 day ago
Post
2485
What are the best organizations to follow on @huggingface?

Off the top of my head:
- Deepseek (35,000 followers): https://huggingface.co/deepseek-ai
- Meta Llama (27,000 followers): https://huggingface.co/meta-llama
- Black Forest Labs (11,000 followers): https://huggingface.co/black-forest-labs
- OpenAI (5,000 followers): https://huggingface.co/openai
- Nvidia (16,000 followers): https://huggingface.co/nvidia
- Microsoft (9,000 followers): https://huggingface.co/microsoft
- AllenAI (2,000 followers): https://huggingface.co/allenai
- Mistral (5,000 followers): https://huggingface.co/mistralai
- XAI (600 followers): https://huggingface.co/xai-org
- Stability AI (16,000 followers): https://huggingface.co/stabilityai
- Qwen (16,000 followers): https://huggingface.co/Qwen
- GoogleAI (8,000 followers): https://huggingface.co/google
- Unsloth (3,000 followers): https://huggingface.co/unsloth
- Bria AI (4,000 followers): https://huggingface.co/briaai
- NousResearch (1,300 followers): https://huggingface.co/NousResearch

Bonus, the agent course org with 17,000 followers: https://huggingface.co/agents-course
  • 1 reply
·
replied to AdinaY's post 2 days ago
reacted to AdinaY's post with ❤️ 2 days ago
Post
4056
🚀 StepFun阶跃星辰 is making BIG open moves!

Last year, their GOT-OCR 2.0 took the community by storm 🔥 but many didn't know they were also building some other amazing models. Now, they've just dropped something huge on the Hub!

📺 Step-Video-T2V: a 30B bilingual open video model that generates 204 frames (8-10s) at 540P resolution with high information density & consistency.
stepfun-ai/stepvideo-t2v

🔊 Step-Audio-TTS-3B: a TTS model trained with the LLM-Chat paradigm on a large synthetic dataset, capable of generating rap and humming
stepfun-ai/step-audio-67b33accf45735bb21131b0b
·
reacted to nicolay-r's post with 👀 3 days ago
Post
2309
📢 For those starting to work with LLM streaming on the web, here is a minimal JS example for accessing a server hosted with FastAPI via REST:
https://gist.github.com/nicolay-r/840425749cf6d3e397da3d329e894d59

The code above is a revised version of the Replicate API example posted earlier:
https://huggingface.co/posts/nicolay-r/390307941200307

The key difference from the Replicate API version:
- it uses only POST, passing a body with parameters and fetching the reader from the response.
reacted to clem's post with 🔥 3 days ago
Post
3248
We crossed 1B+ tokens routed to our inference provider partners on HF, a feature we released just a few days ago.

Just getting started of course but early users seem to like it & always happy to be able to partner with cool startups in the ecosystem.

Have you been using any integration and how can we make it better?

https://huggingface.co/blog/inference-providers
reacted to sayakpaul's post with 🔥 3 days ago
Post
2698
Inference-time scaling meets Flux.1-Dev (and others) 🔥

Presenting a simple re-implementation of "Inference-time scaling diffusion models beyond denoising steps" by Ma et al.

I did the simplest random search strategy, but results can potentially be improved with better-guided search methods.

Supports Gemini 2 Flash & Qwen2.5 as verifiers for "LLMGrading" 🤗

The steps are simple:

For each round:

1> Start by sampling 2 starting noises with different seeds.
2> Score the generations w.r.t. a metric.
3> Obtain the best generation from the current round.

If you have more compute budget, go to the next search round: scale the noise pool (2 ** search_round) and repeat steps 1-3.

This constitutes the random search method as done in the paper by Google DeepMind.
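
The rounds described above can be sketched roughly as follows. This is a toy illustration and not the code from the repository: `generate` and `score` are hypothetical stand-ins for the diffusion pipeline and the verifier, starting noises are represented only by their seeds, and the pool size of `2 ** search_round` is taken from the description.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class RoundResult:
    search_round: int
    pool_size: int
    best_seed: int
    best_score: float


def random_search(
    generate: Callable[[int], Any],
    score: Callable[[Any], float],
    num_rounds: int,
    base_seed: int = 0,
) -> List[RoundResult]:
    """Run `num_rounds` rounds of random search over starting seeds."""
    rng = random.Random(base_seed)
    results = []
    for search_round in range(1, num_rounds + 1):
        pool_size = 2 ** search_round  # 2 noises in round 1, then 4, 8, ...
        # Step 1: sample starting noises (here: distinct seeds).
        seeds = rng.sample(range(2 ** 32), pool_size)
        # Step 2: generate one candidate per seed and score it.
        scored = [(score(generate(seed)), seed) for seed in seeds]
        # Step 3: keep the best candidate of the current round.
        best_score, best_seed = max(scored)
        results.append(RoundResult(search_round, pool_size, best_seed, best_score))
    return results


# Hypothetical stand-ins: a "generation" is a single Gaussian draw and the
# "metric" prefers draws close to zero.
def toy_generate(seed: int) -> float:
    return random.Random(seed).gauss(0.0, 1.0)


def toy_score(sample: float) -> float:
    return -abs(sample)


if __name__ == "__main__":
    for result in random_search(toy_generate, toy_score, num_rounds=3):
        print(result)
```

In a real setup, `generate` would run the image model from a given starting noise and `score` would call the chosen verifier; more guided search strategies would replace the uniform sampling in step 1.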

Code, more results, and a bunch of other stuff are in the repository. Check it out here: https://github.com/sayakpaul/tt-scale-flux/ 🤗
reacted to Quazim0t0's post with 👍 3 days ago
Post
2284
My first attempt at using SmolAgents:
Quazim0t0/CSVAgent

The video attached was an example for this space.

Based on ZennyKenny's SqlAgent:
ZennyKenny/sqlAgent

You can upload a CSV file and it will automatically populate the table, then you can ask questions about the data.

Grab a sample CSV file here: https://github.com/datablist/sample-csv-files

The questions that can be asked may be limited.

_______________________
Second: Quazim0t0/TXTAgent
An agent that converts a .txt file into a CSV file; you can then ask about the data and download the generated CSV file.

_______________________
Third: Quazim0t0/ReportAgent
Upload multiple TXT/DOC files to generate a report from them.

_______________________
Lastly: Quazim0t0/qResearch
A research tool that uses DuckDuckGo web searches and Wikipedia, then tries to refine the answers in MLA format.

reacted to prithivMLmods's post with 🚀 3 days ago
Post
4400
The last week of Impression Craft Arts and sketches from strangerzonehf🎨🧑🏻‍🎨

- Collection : strangerzonehf/Flux-Ultimate-LoRA-Collection

Adapters:
+ Ld-Art : strangerzonehf/Ld-Art
+ Animeopix-Flux : strangerzonehf/Animeopix-Flux
+ Flux-Super-Paint-LoRA : strangerzonehf/Flux-Super-Paint-LoRA
+ CinematicShot-Pics-Flux : strangerzonehf/cinematicShot-Pics-Flux
+ Oil-Wall-Art-Flux : strangerzonehf/Oil-Wall-Art-Flux
+ Pixelo-Flux : strangerzonehf/Pixelo-Flux
+ Abstract-Shattered : strangerzonehf/Abstract-Shattered
+ Neon-Impressionism-Flux : strangerzonehf/Neon-Impressionism-Flux
+ NewG-Art : strangerzonehf/NewG-Art

🪧Demo : prithivMLmods/FLUX-LoRA-DLC
🤗Page : https://huggingface.co/strangerzonehf
reacted to ZennyKenny's post with 👍 4 days ago
Post
1936
Okay, this is pretty crazy. Snowflake has CortexAI and Uber is already teasing QueryGPT, both of which prominently feature plain-text-to-SQL querying of your database.

I decided to see how hard it would be to put together something similar using 🤗 smolagents. Turns out, it was pretty straightforward. I managed to get it done in London Luton airport this afternoon.

ZennyKenny/sqlAgent
  • 2 replies
·
reacted to Jaward's post with 👀 4 days ago
Post
3755
Finally, here it is: a faster, custom, scalable GRPO trainer for smaller models with < 500M params. It can train on a CPU with 8 GB of RAM and also supports GPU for sanity's sake (including support for vLLM + flash attention). It uses SmolLM2-135M/360M-Instruct as ref & base models. Experience your own “aha” moment 🐳 on 8 GB of RAM.
Code: https://github.com/Jaykef/ai-algorithms/blob/main/smollm2_360M_135M_grpo_gsm8k.ipynb
  • 2 replies
·
reacted to Akjava's post with 🔥 4 days ago
reacted to schuler's post with 😎 4 days ago
Post
3329
🔮 GPT-3 implemented in pure Free Pascal!
https://github.com/joaopauloschuler/gpt-3-for-pascal

This implementation follows the GPT-3 Small architecture from the landmark paper "Language Models are Few-Shot Learners":
┌─────────────────────────┐
│       Input Layer       │
├─────────────────────────┤
│   Token & Positional    │
│        Embedding        │
├─────────────────────────┤
│     12x Transformer     │
│         Blocks          │
│  - 12 heads             │
│  - 768 hidden dims      │
│  - 3072 intermediate    │
├─────────────────────────┤
│      Output Layer       │
└─────────────────────────┘

Clean Pascal implementation:
for CntLayer := 1 to {Layers=}12 do
begin
  Result.AddTransformerBlockCAI(
    {Heads=}12, 
    {intermediate dimensions=}4*768, 
    {NoForward=}true, 
    {HasNorm=}true, 
    false
  );
end;
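
As a quick sanity check on the dimensions above, the arithmetic can be worked through in a few lines. This is a rough sketch that assumes a standard transformer block (four attention projections plus a two-layer feed-forward network) and ignores biases, layer norms and embeddings; the Pascal implementation may differ in its details.

```python
LAYERS = 12
HEADS = 12
HIDDEN = 768
INTERMEDIATE = 4 * HIDDEN  # matches the 4*768 passed in the Pascal loop


def transformer_block_weights(hidden: int, intermediate: int) -> int:
    """Weight count of one block under the assumptions stated above."""
    attention = 4 * hidden * hidden           # Q, K, V and output projections
    feed_forward = 2 * hidden * intermediate  # up and down projections
    return attention + feed_forward


head_dim = HIDDEN // HEADS
total_block_weights = LAYERS * transformer_block_weights(HIDDEN, INTERMEDIATE)

print(head_dim)             # 64
print(INTERMEDIATE)         # 3072
print(total_block_weights)  # 84934656
```

So the twelve blocks account for roughly 85M weights under these assumptions, with each attention head working on a 64-dimensional slice of the 768 hidden dimensions.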

reacted to onekq's post with 👍 7 days ago
Post
1752
R1 is still trending. Here is a collection of works trying to replicate R1.
onekq-ai/r1-reproduction-works-67a93f2fb8b21202c9eedf0b

Players include Hugging Face (Open R1), Stanford (simple scaling), Berkeley (Bespoke, Open Thoughts, etc.), ServiceNow, etc. I know there is another work from HKUST but couldn't find it on 🤗. Let me know if I missed any teams.
  • 5 replies
·
reacted to AdinaY's post with 🚀 7 days ago
Post
2514
Ovis2 🔥 a multimodal LLM released by the Alibaba AIDC team.
AIDC-AI/ovis2-67ab36c7e497429034874464
✨1B/2B/4B/8B/16B/34B
✨Strong CoT for deeper problem solving
✨Multilingual OCR – Expanded beyond English & Chinese, with better data extraction
reacted to mrzjy's post with 👀 7 days ago
Post
1266
A very small project:

Introducing CreativeTinyZero:
mrzjy/Qwen2.5-1.5B-GRPO-Creative-Ad-Generation

Unlike the impressive DeepSeek-R1(-Zero), this project focuses on a pure reinforcement learning (RL) experiment applied to an open-domain task: creative advertisement generation.

Objective:

- To investigate the feasibility of applying R1-like methods to an open-domain task without a verifiable ground-truth reward, while at least demonstrating its potential.
- To explore whether <think> and <answer> rewards can be explicitly designed to provide strong guidance through RL based on human prior knowledge.

Note:
- Our goal is not to induce self-reflective thinking, but to align with human thought processes purely through RL, without any supervised fine-tuning (SFT) on any constructed dataset.

Despite its small size, the resulting 1.5B-GRPO model demonstrates intriguing generative capabilities—though it's still far from perfect.
  • 1 reply
·
reacted to Duskfallcrew's post with 👍 7 days ago
Post
1417
I don't have the stamina to port my articles tonight, i've been dealing with my CPTSD seizures again - but here's a fun update over on Bluesky!
https://bsky.app/profile/duskfallcrew.bsky.social/post/3li4zwdhy5c2q
HF's been my open source home since before i got on Civitai, and while i've largely left Civitai, i can't leave AI yet.
SO if y'all don't mind me trying to rebuild my "empire" one nerd block at a time, i'll keep my content easily accessible, :)

OH PSST New AI/ML Discord i made recently:
(It's also a shill for my main twitch/media/music hub)

Join us on this journey. Welcome to Ktiseos Nyx.

Our Discord:
https://discord.gg/HhBSvM9gBY

Earth & Dusk Media
https://discord.gg/5t2kYxt7An

:3 Can't wait to hang out, and i've always linked back to HF for my E&D content in terms of my lora backups and checkpoints!

Y'all who make diffusers versions of my content:
YOU ROCK. Do me a smidge favor: :3 aside from linking back can you maaaaaybe add the new K/N discord on there?

it's my geeky new AI safe space. XD
Also yea, if you've watched the new Beetlejuice movie, you know that i will never quit the ectoplasmic nerd train XD
  • 1 reply
·
reacted to as-cle-bert's post with 🚀 7 days ago
Post
1360
𝐒𝐜𝐢𝐍𝐞𝐰𝐬𝐁𝐨𝐭 - 𝐑𝐞𝐩𝐨𝐫𝐭 𝐝𝐚𝐢𝐥𝐲 𝐒𝐜𝐢𝐞𝐧𝐜𝐞 𝐧𝐞𝐰𝐬 𝐨𝐧 𝐁𝐥𝐮𝐞𝐒𝐤𝐲

GitHub 👉 https://github.com/AstraBert/SciNewsBot
BlueSky 👉 https://bsky.app/profile/sci-news-bot.bsky.social

Hi there HF Community!🤗
I just created a very simple AI-powered bot that shares fact-checked news about Science, Environment, Energy and Technology on BlueSky :)

The bot takes news from Google News, filters out the sources that are not represented in the Media Bias Fact Check database, and then evaluates the reliability of the source based on the MBFC metrics. After that, it creates a catchy headline for the article and publishes the post on BlueSky📰

The cool thing? SciNewsBot is open-source and is cheap to maintain, as it is based on mistralai/Mistral-Small-24B-Instruct-2501 (via Mistral API). You can reproduce it locally, spinning it up on your machine, and even launch it on cloud through a comfy Docker setup🐋

Have fun and spread Science!✨
reacted to m-ric's post with 🚀 7 days ago
Post
2319
For those who haven't come across it yet, here's a handy trick to discuss an entire GitHub repo with an LLM:

=> Just replace "github" with "gitingest" in the URL, and you get the whole repo as a single string that you can then paste into your LLM
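
The URL swap is easy to script. Here is a minimal sketch, assuming the replacement maps the host `github.com` to `gitingest.com` (read literally from the tip above); `to_gitingest` is a hypothetical helper name, and the function only rewrites the address without fetching anything.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitingest(url: str) -> str:
    """Rewrite a GitHub repo URL so that it points at gitingest instead."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {url}")
    # Assumption: only the host changes; path, query and fragment are kept.
    return urlunsplit(parts._replace(netloc="gitingest.com"))


if __name__ == "__main__":
    print(to_gitingest("https://github.com/sayakpaul/tt-scale-flux/"))
```

Swapping only the host, rather than doing a plain string replace, avoids touching a repository or owner name that happens to contain "github".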