
Quazimoto PRO

Quazim0t0

AI & ML interests

the hunchback of huggingface 🔙 joined: 1-20-2025 🦥 unsloth user 4️⃣ Phi User 🔨 ai hobbyist 📫 On Leaderboards Top 100-200

Recent Activity

updated a model about 2 hours ago
Quazim0t0/Geedorah-14B
updated a model about 2 hours ago
Quazim0t0/Lineage-14B
updated a model about 2 hours ago
Quazim0t0/mocha-14B

Organizations

Seance Table

Quazim0t0's activity

reacted to onekq's post with 👍 1 day ago
A bigger and harder pain point for reasoning models is switching modes.

We now have powerful models capable of either System 1 thinking or System 2 thinking, but not both, much less switching between the two. But humans can do this quite easily.

ChatGPT and others push the burden onto users, who have to switch between models themselves. I guess this is the best we have for now.
  • 2 replies
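As a rough illustration of what automatic mode switching could look like (my own sketch, not anything from the post), a trivial router might send each prompt either to a fast model or to a reasoning model based on a crude heuristic. The model IDs and the heuristic below are placeholders.

```python
# Naive "mode router" sketch: pick a fast model or a reasoning model per prompt.
# Model IDs and the difficulty heuristic are placeholders, not a real design.
from huggingface_hub import InferenceClient

FAST_MODEL = "meta-llama/Llama-3.1-8B-Instruct"   # placeholder "System 1" model
REASONING_MODEL = "deepseek-ai/DeepSeek-R1"       # placeholder "System 2" model

def needs_deliberation(prompt: str) -> bool:
    """Crude stand-in for a real difficulty classifier."""
    keywords = ("prove", "step by step", "olympiad", "derive", "debug")
    return len(prompt) > 400 or any(k in prompt.lower() for k in keywords)

def answer(prompt: str) -> str:
    model = REASONING_MODEL if needs_deliberation(prompt) else FAST_MODEL
    client = InferenceClient(model=model)
    response = client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=512,
    )
    return response.choices[0].message.content
```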
reacted to AdinaY's post with 🔥 1 day ago
reacted to thomwolf's post with 🚀 1 day ago
We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️ OlympicCoder (open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming, a domain Anthropic has been historically really strong at, and it's getting close to o1-mini/R1 on olympiad-level coding with just 7B parameters!

And the best part is that we're open-sourcing everything about it: the training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets we are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions
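For readers who want to poke at these releases, a minimal sketch (not from the Open-R1 repo) of trying OlympicCoder-7B with transformers and peeking at one of the datasets might look like this; the chat-template usage and the "train" split name are assumptions.

```python
# Minimal sketch: generate with OlympicCoder-7B and inspect one released dataset.
# Assumes the tokenizer defines a chat template and the dataset has a "train" split.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("open-r1/OlympicCoder-7B")
model = AutoModelForCausalLM.from_pretrained("open-r1/OlympicCoder-7B", device_map="auto")

messages = [{"role": "user", "content": "Write a function that checks whether a string is a palindrome."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))

# Peek at one of the accompanying datasets
ds = load_dataset("open-r1/codeforces", split="train")
print(ds[0])
```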
reacted to Lunzima's post with 🚀 1 day ago
I'm currently experimenting with the SFT dataset Lunzima/alpaca_like_dataset to further boost the performance of NQLSG-Qwen2.5-14B-MegaFusion-v9.x. This includes data sourced from DeepSeek-R1 or other cleaned results (excluding CoTs). Additionally, datasets that could potentially enhance the model's performance in math and programming/code, as well as those dedicated to specific uses like Swahili, are part of the mix.
@sometimesanotion @sthenno @wanlige
  • 1 reply
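For context, an SFT pass over an alpaca-style dataset like this one could be sketched with TRL roughly as below; the column names ("instruction", "input", "output") and the base-model ID are my assumptions, not details from the post.

```python
# Hypothetical SFT sketch with TRL over an alpaca-style dataset.
# Column names and the base model ID are assumptions, not taken from the post.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

ds = load_dataset("Lunzima/alpaca_like_dataset", split="train")

def to_text(example):
    # Flatten instruction/input/output into a single "text" field for SFT.
    prompt = example["instruction"]
    if example.get("input"):
        prompt += "\n\n" + example["input"]
    return {"text": prompt + "\n\n" + example["output"]}

ds = ds.map(to_text)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-14B-Instruct",   # placeholder; the post's base is NQLSG-Qwen2.5-14B-MegaFusion-v9.x
    train_dataset=ds,
    args=SFTConfig(output_dir="sft-output", per_device_train_batch_size=1, gradient_accumulation_steps=8),
)
trainer.train()
```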
reacted to awacke1's post with 🚀 2 days ago
Introducing, under the MIT license, my ML model specialized fine-tuning app "SFT Tiny Titans" 🚀

Demo video with source.

Download, train, SFT, and test your models, easy as 1-2-3!
URL: awacke1/TorchTransformers-NLP-CV-SFT
  • 2 replies
reacted to BrigitteTousi's post with 🚀 2 days ago
reacted to sandhawalia's post with 🔥 2 days ago
LeRobot goes to driving school: the world's largest open-source self-driving dataset, ready for end-to-end learning with LeRobot.

3 years, 30 German cities, 60 driving instructors and students. https://huggingface.co/blog/lerobot-goes-to-driving-school

Coming this summer: LeRobot driver.
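A rough sketch of how such a dataset would typically be consumed with the lerobot library is below; the repo ID is a placeholder since the post doesn't name it, and the field names depend on the actual dataset.

```python
# Sketch only: iterate frames from a LeRobot-format dataset.
# The repo ID below is a placeholder; see the linked blog post for the real one.
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("yaak-ai/lerobot-driving-school")  # hypothetical repo ID
print(f"{len(dataset)} frames")

frame = dataset[0]  # one timestep: camera images, state, actions, timestamps
for key, value in frame.items():
    print(key, getattr(value, "shape", value))
```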
reacted to fdaudens's post with 🤗 3 days ago
Honored to be named among the 12 pioneers and power players in the news industry in Future Today Strategy Group's 2025 Tech Trends Report.

Incredible group to be part of - each person is doing groundbreaking work at the intersection of AI and journalism. Worth following them all: they're consistently sharing practical insights on building the future of news.

Take the time to read this report, it's packed with insights as always. The news & information section's #1 insight hits hard: "The most substantive economic impact of AI to date has been licensing payouts for a handful of big publishers. The competition will start shifting in the year ahead to separate AI 'haves' that have positioned themselves to grow from the 'have-nots.'"

This AI-driven divide is something I've been really concerned about. Now is the time to build more than ever!

๐Ÿ‘‰ Full report here: https://ftsg.com/wp-content/uploads/2025/03/FTSG_2025_TR_FINAL_LINKED.pdf
  • 2 replies
replied to their post 4 days ago
posted an update 4 days ago
Update to the Imagine side project.
Just uploaded the 16-bit & Q4 versions.

Samples (used a base Microsoft Phi-4 model):
*You may experience bugs with either the model or the Open WebUI function*
Open WebUI function: https://openwebui.com/f/quaz93/imagine_phi
Quazim0t0/Imagine-v0.5-16bit - Haven't tested
Quazim0t0/ImagineTest-v0.5-GGUF - Tested (Pictures)

Dataset: Quazim0t0/Amanita-Imagine
Small Dataset of 500+ entries, still working on it here and there when I can.
Pictures use the Open WebUI function I provided.
  • 1 reply
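As a hedged example (not part of the post), the Q4 GGUF could be tried locally with llama-cpp-python along these lines; the quant filename is a guess, so match it to whatever is actually in the repo.

```python
# Hypothetical local test of the Q4 GGUF with llama-cpp-python.
# The filename pattern is a guess; pick the file actually present in the repo.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Quazim0t0/ImagineTest-v0.5-GGUF",
    filename="*Q4_K_M.gguf",   # assumed quant name
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Imagine rain on a tin roof, then describe it."}]
)
print(out["choices"][0]["message"]["content"])
```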
replied to their post 11 days ago

I'll try to get that out for you when I get a chance.

reacted to AdinaY's post with 🔥🚀 11 days ago
reacted to ZennyKenny's post with 👍 12 days ago
I've spent most of my time working with AI on user-facing apps like chatbots and text generation, but today I decided to work on something that I think has a lot of applications for data science teams: ZennyKenny/comment_classification

This Space supports uploading a user CSV and categorizing the fields based on user-defined categories. The applications of AI in production are truly endless. ๐Ÿš€
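The general pattern behind that kind of Space (sketched here from scratch, not taken from its source) is to run each free-text field through a zero-shot classifier against the user-defined categories; the column name "comment" and the category list below are placeholders.

```python
# Generic sketch of CSV field categorization with a zero-shot classifier.
# The column name "comment" and the category list are placeholders.
import pandas as pd
from transformers import pipeline

categories = ["bug report", "feature request", "praise", "spam"]
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

df = pd.read_csv("comments.csv")
df["category"] = [
    classifier(text, candidate_labels=categories)["labels"][0]  # top-scoring label
    for text in df["comment"].astype(str)
]
df.to_csv("comments_categorized.csv", index=False)
```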
posted an update 12 days ago
Debugging Tags:
Imagine, Associated Thoughts, Dialectical Analysis, Backwards Induction, Metacognition, and Normal Thought Processes such as <think> or <begin_of_thought>

Edit: uploaded new images with an Open WebUI function to organize the tags.
Open WebUI Function: https://openwebui.com/f/quaz93/imagine_phi

This Phi-4 model is part of a test project that I called Micro-Dose. My goal was to use a small dataset to activate reasoning and other cognitive processes without relying on a large dataset.

I found that this was possible with a tiny dataset of just 90 rows, specifically designed as math problems. In the initial iterations, the dataset only activated reasoning when a math-related question was asked. I then made a few changes to the dataset's structure, including the order of information and the naming of tags. You can see the sample results in the pictures. Not really anything special, just thought I'd share.

Tweaked the dataset a bit:
Quazim0t0/Imagine-Phi-v0.2-GGUF
Quazim0t0/MicroDoseV0.2


The first image shows the new tags, the second shows the regular thought process, and the third shows the model in combination with web searches.
  • 2 replies
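To show how such tagged output could be post-processed (my own sketch; the exact tag strings the model emits are assumptions based on the tag names listed in the post), the sections can be pulled apart with a small parser like this:

```python
# Illustrative parser for the custom reasoning sections; the exact tag strings
# are assumptions derived from the tag names mentioned in the post.
import re

TAGS = ["imagine", "associated_thoughts", "dialectical_analysis",
        "backwards_induction", "metacognition", "think"]

def extract_sections(text: str) -> dict:
    sections = {}
    for tag in TAGS:
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL | re.IGNORECASE)
        if match:
            sections[tag] = match.group(1).strip()
    # whatever is left outside the tags is treated as the final answer
    sections["answer"] = re.sub(r"<\w+>.*?</\w+>", "", text, flags=re.DOTALL).strip()
    return sections
```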
reacted to lingvanex-mt's post with 🔥👍 12 days ago
Dear HF Community!

Our company has open-sourced machine translation models for 12 rare languages under the MIT license.

You can use them freely with the OpenNMT translation framework. Each model is about 110 MB and has excellent performance (about 40,000 characters/s on an Nvidia RTX 3090).

Download the models here:

https://huggingface.co/lingvanex

You can test translation quality here:

https://lingvanex.com/translate/
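Assuming the checkpoints are standard OpenNMT-py models (not verified here), a translation run could be driven from Python roughly like this; the file names are placeholders, and the input is expected to be pre-tokenized the way the models were trained.

```python
# Sketch: invoke OpenNMT-py's translate CLI on a placeholder checkpoint.
# File names are placeholders; input must be tokenized as the model expects.
import subprocess

subprocess.run(
    [
        "onmt_translate",
        "-model", "lingvanex_model.pt",   # placeholder checkpoint name
        "-src", "input.src.txt",          # one source sentence per line
        "-output", "output.tgt.txt",
        "-gpu", "0",
    ],
    check=True,
)
```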
reacted to caelancooper's post with 👍 19 days ago
Hey Huggingface Community,

I'm just starting my journey. I'm here to learn and contribute as much as I can to the AI community. With one of my models, I left the security permissions open so that people could commit changes and contribute in good faith, and the opposite happened.

I'm open to all feedback you may have on my future projects. Let's keep it collegial and try to make something amazing. I always strive to make situations a win for all parties involved and would love to collaborate with anybody who's interested in innovation, optimization, and new use cases for AI.

Thanks Everyone,
Caelan
posted an update 23 days ago
My first attempt at using SmolAgents:
Quazim0t0/CSVAgent

The attached video is an example for this Space.

Based on ZennyKenny's SqlAgent:
ZennyKenny/sqlAgent

You can upload a CSV file and it will automatically populate the table, then you can ask questions about the data.

Grab a sample CSV file here: https://github.com/datablist/sample-csv-files

The questions that can be asked may be limited.

_______________________
Second: Quazim0t0/TXTAgent
Created an agent that converts a .txt file into a CSV file; you can then ask about the data and download the generated CSV file.

_______________________
Third: Quazim0t0/ReportAgent
Upload multiple TXT/DOC files to generate a report from them.

_______________________
Lastly: Quazim0t0/qResearch
A research tool that uses DuckDuckGo for web searches and Wikipedia, and tries to refine the answers into MLA format.
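In the same spirit (my own sketch, not the Spaces' source code), a smolagents CodeAgent that is allowed to import pandas can already answer questions about an uploaded CSV; the default model and the prompt are placeholders.

```python
# Rough smolagents sketch of the CSV question-answering pattern.
# The model default and the prompt are placeholders, not the Spaces' code.
from smolagents import CodeAgent, HfApiModel

agent = CodeAgent(
    tools=[],
    model=HfApiModel(),                        # hosted inference model by default
    additional_authorized_imports=["pandas"],  # let generated code load the CSV
)

answer = agent.run(
    "Load 'customers.csv' with pandas and report how many customers are in each country."
)
print(answer)
```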

reacted to Jaward's post with 🔥 24 days ago
Finally, here it is: a faster, custom, scalable GRPO trainer for smaller models with < 500M params. It can train on an 8 GB RAM CPU and also supports GPU for sanity's sake (includes support for vLLM + Flash Attention). Using SmolLM2-135M/360M-Instruct as reference & base models. Experience your own "aha" moment 🐳 on 8 GB of RAM.
Code: https://github.com/Jaykef/ai-algorithms/blob/main/smollm2_360M_135M_grpo_gsm8k.ipynb
  • 2 replies
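For readers new to GRPO, the core idea such a trainer implements is the group-relative advantage: sample several completions per prompt, score them, and normalize each reward against its own group. A minimal sketch of that step (written independently of the linked notebook):

```python
# Group-relative advantage, the heart of GRPO, sketched independently
# of the linked notebook.
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """rewards: (num_prompts, group_size) raw rewards for sampled completions."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

# e.g. 2 prompts, 4 sampled completions each; reward = 1 if the answer was correct
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 0.0, 1.0]])
print(group_relative_advantages(rewards))
```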