✨ Multiple content modalities (text, images, video thumbnails)
✨ Rich user interaction data (from Xiaohongshu's 300M+ MAUs, 70%+ search penetration)
✨ Comprehensive evaluation metrics
✨ Support for RAG system development
🚀 Big news for AI agents! With the latest release of smolagents, you can now securely execute Python code in sandboxed Docker or E2B environments. 🦾🔒
Here's why this is a game-changer for agent-based systems: 🧵👇
1️⃣ Security First 🔐 Running AI agents in unrestricted Python environments is risky! With sandboxing, your agents are isolated, preventing unintended file access, network abuse, or system modifications.
2️⃣ Deterministic & Reproducible Runs 📦 By running agents in containerized environments, you ensure that every execution happens in a controlled and predictable setting—no more environment mismatches or dependency issues!
3️⃣ Resource Control & Limits 🚦 Docker and E2B allow you to enforce CPU, memory, and execution time limits, so rogue or inefficient agents don’t spiral out of control.
4️⃣ Safer Code Execution in Production 🏭 Deploy AI agents confidently, knowing that any generated code runs in an ephemeral, isolated environment, protecting your host machine and infrastructure.
5️⃣ Easy to Integrate 🛠️ With smolagents, you can simply configure your agent to use Docker or E2B as its execution backend—no need for complex security setups! (Minimal sketch at the end of this thread 👇)
6️⃣ Perfect for Autonomous AI Agents 🤖 If your AI agents generate and execute code dynamically, this is a must-have to avoid security pitfalls while enabling advanced automation.
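Here's roughly what that looks like in code. A minimal sketch, assuming the executor_type parameter and InferenceClientModel wrapper from recent smolagents releases; check the docs for your installed version:

```python
# Minimal sketch: run a CodeAgent inside a sandboxed executor.
# Assumes the executor_type parameter and InferenceClientModel wrapper
# from recent smolagents releases; verify against your installed version.
from smolagents import CodeAgent, InferenceClientModel

model = InferenceClientModel()  # any supported model wrapper works here

agent = CodeAgent(
    tools=[],
    model=model,
    executor_type="docker",  # or "e2b" for a hosted cloud sandbox
)

# Generated code now runs inside the container, not on your host machine.
agent.run("Compute the 20th Fibonacci number in Python.")
```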
Published a stable version of a Ukrainian Text-to-Speech library on GitHub and PyPI.
Features:
- Multi-speaker model: 2 female (Tetiana, Lada) + 1 male (Mykyta) voices
- Fine-grained control over speech parameters, including duration, fundamental frequency (F0), and energy
- High-fidelity speech generation using the RAD-TTS++ acoustic model
- Fast vocoding using Vocos
- Synthesizes long sentences effectively
- Supports a sampling rate of 44.1 kHz
- Tested on Linux environments and Windows/WSL
- Python API (requires Python 3.9 or later)
- CUDA-enabled for GPU acceleration
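To give a flavor of the Python API, here is a hypothetical usage sketch; the import path, class, and argument names below are illustrative placeholders, not the library's confirmed API, so check the README on GitHub for the real entry points:

```python
# Hypothetical usage sketch: the import path, class, and argument
# names are illustrative placeholders; see the README for the real API.
import soundfile as sf  # assumed here just to write the waveform

from ukrainian_tts import TTS  # placeholder import path

tts = TTS(device="cuda")  # CUDA-enabled for GPU acceleration

# Multi-speaker: "tetiana" or "lada" (female), "mykyta" (male)
audio = tts.synthesize("Привіт, як справи?", voice="tetiana")

sf.write("output.wav", audio, 44100)  # 44.1 kHz sampling rate
```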
Or check it out in the linked HuggingFace dataset!
What makes this dataset unique, useful, and capable of bridging the Sim2Real gap?
💠 The digital twins are not generated by AI; they are crafted by 3D artists to be INDISTINGUISHABLE from the physical-world objects. This allows models trained on this data to transfer to real-world applications.
💠 The simulation software, called FalconEditor, can easily create thousands of images with varying lighting, posing, occlusions, backgrounds, camera positions, and more. This enables robust model training.
💠 The labels are created along with the data. This not only saves large amounts of time, but also ensures the labels are incredibly accurate and reliable.
If you want to create your own thinking model or build a better MistralThinker, I just uploaded my entire dataset (generated with DeepSeek R1) and the axolotl config. (Well, I made them public.)
Super happy to welcome Nvidia as our latest enterprise hub customer. They have almost 2,000 team members using Hugging Face, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!
I was puzzled by the scope of 🐋DeepSeek🐋 projects, i.e. why they built (and then open-sourced) so many pieces across their technology stack. Good engineers are minimalists: they build only when they have to.
Then I realized that FP8 should be the main driving force here. On the H800, your raw inter-GPU bandwidth is cut in half. But if you compress your data representation from 16 bits to 8 bits, the effective throughput of your workload stays unchanged!
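A quick back-of-envelope check of that claim (the bandwidth numbers below are illustrative placeholders, not real spec-sheet figures):

```python
# Back-of-envelope: halved interconnect bandwidth, offset by FP8.
# Bandwidth figures are illustrative placeholders, not real specs.
elements = 1e9       # tensor elements to exchange between GPUs
full_bw = 900e9      # hypothetical full bandwidth, bytes/s
halved_bw = 450e9    # hypothetical halved (H800-like) bandwidth

t_fp16_full = (elements * 2) / full_bw      # FP16 = 2 bytes/element
t_fp8_halved = (elements * 1) / halved_bw   # FP8  = 1 byte/element

print(f"FP16 over full bandwidth:  {t_fp16_full * 1e3:.2f} ms")
print(f"FP8 over halved bandwidth: {t_fp8_halved * 1e3:.2f} ms")
# Both print ~2.22 ms: same transfer time, so the workload's
# effective throughput is unchanged.
```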
The idea is simple, but a lot of work had to be done. Their v3 technical report will give you a holistic view (better than reading the code). To summarize: data structures are the foundation of any software. Since FP8 was new and untried, the ecosystem wasn't there, so DeepSeek became the trailblazer. Before cooking your meals, you need to till the land, grow crops, and grind the flour 😅
At the time of posting, watt-ai/watt-tool-70B continues to top the Berkeley Function-Calling Leaderboard, with the 8B version occupying 4th place. A remarkable achievement for a model of that size!
Exciting New Tool for Knowledge Graph Extraction from Plain Text!
I just came across a groundbreaking new tool called KGGen that's solving a major challenge in the AI world - the scarcity of high-quality knowledge graph data.
KGGen is an open-source Python package that leverages language models to extract knowledge graphs (KGs) from plain text. What makes it special is its innovative approach to clustering related entities, which significantly reduces sparsity in the extracted KGs.
The technical approach is fascinating:
1. KGGen uses a multi-stage process involving an LLM (GPT-4o in their implementation) to extract entities and relations from source text
2. It aggregates graphs across sources to reduce redundancy
3. Most importantly, it applies iterative LM-based clustering to refine the raw graph
The clustering stage is particularly innovative - it identifies which nodes and edges refer to the same underlying entities or concepts. This normalizes variations in tense, plurality, stemming, and capitalization (e.g., "labors" clustered with "labor").
The researchers from Stanford and the University of Toronto also introduced MINE (Measure of Information in Nodes and Edges), the first benchmark for evaluating KG extractors. When tested against existing methods like OpenIE and GraphRAG, KGGen outperformed them by up to 18%.
For anyone working with knowledge graphs, RAG systems, or KG embeddings, this tool addresses the fundamental challenge of data scarcity that's been holding back progress in graph-based foundation models.
The package is available via pip install kg-gen, making it accessible to everyone. This could be a game-changer for knowledge graph applications!
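A minimal usage sketch of that workflow; the method and argument names follow the project README as I understand it, but may drift between releases, so verify against the current docs:

```python
# Minimal sketch of the KGGen workflow described above; method and
# argument names may drift between releases, so verify them.
from kg_gen import KGGen

kg = KGGen(model="openai/gpt-4o")  # the paper's implementation uses GPT-4o

text = "Linus Torvalds created Linux. Linux powers Android."

# Stage 1: extract entities and relations from the source text
graph = kg.generate(input_data=text)

# Stage 3: iterative LM-based clustering merges nodes that refer to
# the same entity (tense, plurality, capitalization variants)
clustered = kg.cluster(graph)

print(clustered.entities)
print(clustered.relations)
```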
Made a few improvements on my custom GRPO trainer:
- added sequence similarity reward (seems to work; sketch below)
- improved vLLM support (5x inference speed)
- adjusted reward scores (this helped with format/accuracy)
- can now push to HF Hub (already pushed mine lol: Jaward/smollm2_360m_grpo_gsm8k_reasoner)
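For the curious, a sequence-similarity reward can be as simple as this sketch (my guess at the general shape, not the trainer's actual code):

```python
# Hedged sketch of a sequence-similarity reward: score completions
# by closeness to a reference; not the trainer's actual code.
from difflib import SequenceMatcher

def sequence_similarity_reward(completions, references):
    """Return one reward in [0, 1] per completion."""
    return [
        SequenceMatcher(None, c, r).ratio()
        for c, r in zip(completions, references)
    ]

print(sequence_similarity_reward(
    ["The answer is 42.", "I refuse to answer."],
    ["The answer is 42.", "The answer is 42."],
))  # first ~1.0, second much lower
```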
Chain-of-Thought (CoT) prompting enhances reasoning in AI models by breaking down complex problems into step-by-step logical sequences. It continues proving its effectiveness, especially in top-performing reasoning models. However, there are other similar methods that expand on CoT and can be used for different purposes. Here are 9 of them:
4. Chain-of-RAG -> https://huggingface.co/papers/2501.14342 Creates retrieval chains instead of retrieving all info at once. It can dynamically adjust its search process and parameters such as the number of steps (see the sketch after this list)
9. Chain(s)-of-Knowledge -> https://www.turingpost.com/p/cok Enhances LLMs by dynamically pulling in external knowledge to improve accuracy and reduce errors
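To make point 4 concrete, here is an illustrative loop of the iterative-retrieval idea; the retrieve() and llm() helpers are hypothetical placeholders, and this is a sketch of the concept, not the paper's algorithm:

```python
# Illustrative Chain-of-RAG-style loop: retrieve iteratively and let
# each intermediate answer shape the next query. retrieve() and llm()
# are hypothetical placeholders; this is not the paper's algorithm.
def chain_of_rag(question, retrieve, llm, max_steps=4):
    evidence, query = [], question
    for _ in range(max_steps):
        evidence.extend(retrieve(query))  # small, focused batch per step
        followup = llm(
            f"Question: {question}\nEvidence so far: {evidence}\n"
            "Reply DONE if answerable, otherwise give the next search query."
        )
        if followup.strip() == "DONE":    # the model decides when to stop
            break
        query = followup                  # dynamically adjusted search
    return llm(f"Answer the question '{question}' using: {evidence}")
```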
and it's ranked the number one model under the 25B parameter size mark.
Now, I said "I think," not "I am sure," because this model used the same evaluation metric the AraGen developers use (3C3H) as a reward model to improve its responses, and this sparks a question: is this something good for users, or is it another type of overfitting that we don't want?
I don't know if this is a good thing or a bad thing, but what I do know is that you can try it here: Navid-AI/Yehia-7B-preview
MoD ControlNet Tile Upscaler for SDXL: Upscale Your Images with Ease! 🚀
Meet the MoD ControlNet Tile Upscaler for SDXL, a powerful tool that uses advanced technology to upscale your images without losing quality! Our app processes images in tiles without leaving them blurry or showing visible seams between the tiles. The result? Upscaled images with preserved details and smooth, natural transitions—all through a user-friendly interface. ✨
What MoD Upscaler Offers:
🔍 Preserved Details: Unlike traditional upscalers, the MoD ControlNet Tile Upscaler enlarges your images while maintaining clarity and adding details that might otherwise be lost. Your photos gain more definition without sacrificing original quality.
🧩 Advanced Tiling Technology: We use a smart combination of techniques to ensure natural and smooth transitions between tiles. This means your upscaled images remain consistent and high-quality, even at higher resolutions. No more visible lines or imperfections!
⚡ Fast and Efficient: You don’t need a super-powered computer! Our app is optimized to run quickly and smoothly, even on simpler machines.
🎨 Easy-to-Use Interface: You don’t have to be an expert to use the MoD ControlNet Tile Upscaler. The interface is simple, intuitive, and designed so anyone can achieve professional-quality results without hassle.
Upscale your images without losing quality and with details preserved. Try the MoD ControlNet Tile Upscaler today! 👍
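For the curious, seamless tiles are classically achieved with overlapping tiles and feathered blending. Here is a generic numpy sketch of that idea (an illustration of the general technique, not our app's actual pipeline):

```python
# Generic overlap-and-blend tiling sketch; illustrates the classic
# seam-hiding trick, NOT the MoD app's actual pipeline.
import numpy as np

def _feather(ph, pw, f):
    """Weight mask that fades toward the tile borders over f pixels."""
    m = np.ones((ph, pw, 1), dtype=np.float32)
    r = np.linspace(0.05, 1.0, f, dtype=np.float32)
    fy, fx = min(f, ph), min(f, pw)
    m[:fy] *= r[:fy, None, None]
    m[ph - fy:] *= r[:fy][::-1, None, None]
    m[:, :fx] *= r[None, :fx, None]
    m[:, pw - fx:] *= r[:fx][::-1][None, :, None]
    return m

def upscale_tiled(img, upscale_fn, tile=512, overlap=64, scale=2):
    """Upscale img tile by tile, cross-fading the overlapping seams.

    upscale_fn must return the patch enlarged by `scale`,
    e.g. one ControlNet Tile pass per patch.
    """
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=np.float32)
    wsum = np.zeros((h * scale, w * scale, 1), dtype=np.float32)
    for y in range(0, h, tile - overlap):
        for x in range(0, w, tile - overlap):
            up = upscale_fn(img[y:y + tile, x:x + tile])
            ph, pw, _ = up.shape
            m = _feather(ph, pw, overlap * scale)
            oy, ox = y * scale, x * scale
            out[oy:oy + ph, ox:ox + pw] += up * m
            wsum[oy:oy + ph, ox:ox + pw] += m
    # Normalizing by the accumulated weights cross-fades the overlaps.
    return out / np.maximum(wsum, 1e-8)
```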
I was just playing around with a Python MIDI library and Colab's code generation, and accidentally cooked up a quick n' dirty audio synthesis template. Have fun!
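Something in that spirit, assuming the third-party midiutil package (pip install MIDIUtil) rather than anything in the standard library:

```python
# Quick n' dirty MIDI sketch; assumes the third-party midiutil
# package (pip install MIDIUtil).
from midiutil import MIDIFile

midi = MIDIFile(1)  # one track
midi.addTempo(track=0, time=0, tempo=120)

# A rising C-major arpeggio, one note per beat
for beat, pitch in enumerate([60, 64, 67, 72]):
    midi.addNote(
        track=0, channel=0,
        pitch=pitch, time=beat,  # start time in beats
        duration=1, volume=100,
    )

with open("arpeggio.mid", "wb") as f:
    midi.writeFile(f)
```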
I'd like to draw your attention to a Lamarck-based experiment that uses Arcee AI's newly published arcee_fusion merge method for three of its four merges. Yes, just four. This is a simple one, and its recipe is fully open:
A fusion merge - of a fusion merge and a SLERP of a fusion and an older merge - should demonstrate the new merge method's behavior in interesting ways, especially in the first quarter of the model, where the SLERP has less impact.
I welcome you to kick the tires and learn from it. It has prose quality near Qwenvergence v12's - as you'd expect.
We now have a Deep Research for academia: SurveyX automatically writes academic surveys nearly indistinguishable from human-written ones 🔥
Researchers from Beijing and Shanghai just published the first application of a deep research system to academia: given a question, their algorithm can produce a survey of all papers on the subject.
To write a research survey, you generally follow two steps: preparation (collecting and organizing papers) and writing (outline creation, writing, polishing). The researchers followed the same two steps and automated them.
🎯 For the preparation part, a key step is finding all the important references on the given subject. The researchers first cast a wide net over all relevant papers. But then finding the really important ones is like finding needles in a haystack of information. To solve this challenge, they built an “AttributeTree” object that structures key information from citations. Ablating these AttributeTrees significantly decreased structure and synthesis scores, so they were really useful!
📝 For the writing part, the key was to get a synthesis that's both short and true. This is not easy to get from LLMs! So they used methods like LLM-based deduplication to shorten the overly verbose listings LLMs produce, and RAG to grab original quotes instead of made-up ones.
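As a stand-in to show the deduplication idea: SurveyX does this step with an LLM, but cosine similarity over embeddings plays the same role in this sketch, and embed() is a hypothetical placeholder returning unit-normalized vectors:

```python
# Stand-in sketch for the deduplication idea: SurveyX does this with
# an LLM; here cosine similarity over embeddings plays that role.
# embed() is a hypothetical placeholder returning unit-normalized rows.
def dedupe(sentences, embed, threshold=0.9):
    """Keep one representative per cluster of near-duplicate sentences."""
    vecs = embed(sentences)  # (n, d) numpy array, unit-normalized rows
    kept = []
    for i, v in enumerate(vecs):
        if all(v @ vecs[j] < threshold for j in kept):
            kept.append(i)   # not too close to anything already kept
    return [sentences[i] for i in kept]
```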
As a result, their system outperforms previous approaches by far!
As assessed by LLM judges, the quality score of SurveyX even approaches that of human experts, at 4.59/5 vs 4.75/5 🏆