https://huggingface.co/huihui-ai/DeepSeek-V3-abliterated

#820
by DontPlanToEnd - opened

https://huggingface.co/huihui-ai/DeepSeek-V3-abliterated

There's a good chance this model could top the ugi-leaderboard since base DeepSeek-V3 already does really well. Though if you don't really want to quant another 671B model that's completely understandable.

This model is not even released yet and the team behind it is holding it for ransom, demanding Bitcoin payments and likes. This all seems like a big scam. For 5% of the $6000 they demand I could easily uncensor it myself properly using an uncensored dataset instead of the cheap abliteration technique they used. I did so for multiple 405B models in the past and provided them for free to the community. If they ever release it we will be more than happy to quant it, so please let me know once they do.

Oh. Dang it, sorry. I should really remember to check the model files before submitting a model. I'll notify you if they ever do end up releasing it.

huihui-ai has pumped out countless abliterated models in a pretty timely fashion in my experience. Every time I requested a model my wish was granted in a matter of days, though granted, those were in the 10B-32B range.

nicoboss, do you have any suggestions on uncensored datasets? I've used toxicdpo/amoralqa but I'm looking for more datasets, specifically to score a high W10 on UGI.

huihui-ai has pumped out countless abliterated models in a pretty timely fashion in my experience. Every time I requested a model my wish was granted in a matter of days, though granted, those were in the 10B-32B range.

Great to know. So I just got unlucky and formed a terrible first impression of them by only looking at this specific model. I now see that, apart from the stunt they tried to pull with that one, they are indeed doing amazing work.

nicoboss, do you have any suggestions on uncensored datasets? I've used toxicdpo/amoralqa but I'm looking for more datasets, specifically to score a high W10 on UGI.

To uncensor almost every model I can highly recommend 4 to 6 epochs over Guilherme34/uncensor, which is what I use for most of my uncensored models.
To make models made by Chinese companies politically unbiased I recommend 6 epochs of DPO over nbeerbower/GreatFirewall-DPO in addition to the above uncensored dataset.
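As a back-of-the-envelope illustration of what "6 epochs" means in practice, here is a small sketch that computes the total number of optimizer updates for a run; the dataset size, batch size, and gradient-accumulation values are made-up assumptions for illustration, not figures from this thread.

```python
# Hypothetical illustration: how many optimizer updates does a
# "6 epochs of DPO" run correspond to? All numbers are assumptions.
import math

def optimizer_steps(examples: int, epochs: int, batch_size: int,
                    grad_accum: int) -> int:
    """Total optimizer updates for a full training run."""
    steps_per_epoch = math.ceil(examples / (batch_size * grad_accum))
    return steps_per_epoch * epochs

# e.g. an assumed 10,000-pair DPO dataset, per-device batch 4,
# gradient accumulation 8:
print(optimizer_steps(10_000, epochs=6, batch_size=4, grad_accum=8))
# -> 1878 optimizer updates
```

The same helper applies to the 4-6 epoch SFT recipe; only the dataset size and epoch count change.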

Their models are awesome, so I just queued all of huihui-ai's models to the mradermacher queue. We in fact already did almost all of them, so I must have discovered huihui-ai and mass queued all their models in the past. I'm a big fan of any uncensored models, so given how many abliterated models they make this isn't surprising. I also remember the awesome Pruned-Coder-411B models I quantized on nico1, and while going through their models I just realized that we missed DeepSeek-V3-0324-Pruned-Coder-411B, so time to do it soon.

They're not pulling anything, uploading over a terabyte of data to HF can take a while, you know.

They're not pulling anything, uploading over a terabyte of data to HF can take a while, you know.

@SerialKicked Have you looked at old versions of the README.md or the discussion page? Instead of uploading the model, they edited the README.md on an almost daily basis for the past month. While it all looks reasonable now, it did not in the past. They upload 1.4 TB BF16 V3/R1 models within hours, so they could easily upload this one at any moment if they wanted to.

Them not releasing it is not really what I'm complaining about. It's their decision whether they want to make their model public or not. What I dislike is them abusing the model card of that non-released model to demand ridiculous sums of money and likes to release it. Finetuning such massive models is expensive. I spent a lot of money on my own finetunes, yet I still believe the way they collect money is morally wrong. They could instead give early or exclusive access to their supporters, or create donation goals on a dedicated crowdfunding platform, instead of abusing the model page for this.

I heavily disagree that 0.671 BTC = $56129.52 is a fair price to ask. The cost to uncensor such a model using much more expensive finetuning instead of abliteration is around $300 on RunPod, so with the money they ask for they could properly uncensor around 200 models and likely abliterate around 500 of them. Asking for 500x what creating the model actually costs is why I see it as a scam.
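The cost ratio above can be sanity-checked with quick arithmetic; the dollar figures below are the ones quoted in this thread, not independently verified.

```python
# Sanity-checking the cost ratio claimed above. Both figures are taken
# from this thread: the quoted USD value of 0.671 BTC, and the
# estimated RunPod cost of uncensoring one model via finetuning.
asking_price_usd = 56_129.52   # 0.671 BTC at the quoted exchange rate
finetune_cost_usd = 300        # estimated cost per uncensor finetune

ratio = asking_price_usd / finetune_cost_usd
print(round(ratio))  # -> 187, i.e. roughly the "200 models" figure
```

Abliteration is cheaper than a full finetune, which is where the "around 500" abliterated models estimate comes from.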

Let's quote https://huggingface.co/huihui-ai/DeepSeek-V3-abliterated/commit/6c92f7df5c69ac806860ddfe9ad7ff27d2c7dbb1

"The total control requires 671*3 likes or a total of 0.671 BTC in sponsorships."

In https://huggingface.co/huihui-ai/DeepSeek-V3-abliterated/commit/075d479b19d2a74f40fcfb3844ce32e5b460a043 they then changed it to:

"This link requires a total of 671 * 3 likes or a total of 0.671 BTC in sponsorships."

I must have discovered huihui-ai and mass queued all their models in the past.

That is an interesting conclusion. I queued practically every huihui model when it came out :)

No I didn't check their past edits until now. That said, detailing their costs (correctly or not) is a far cry from extortion or asking for ransom, though. ;)
I wouldn't want to be held accountable for stuff I edited after a post, so I'm not going to hold that against them. People make mistakes (not everyone has English as a first language).

Anyway, not the place. Have a nice day.

I went through all the revisions of the README.md, because I have seen more or less every single one of their model pages and could hardly believe they asked for likes (and money), but it's indeed true, they did. I assume it was a slip, and I think the community told them it was wrong, and likely they changed their ways. Yes, anybody can make mistakes. We like their models, too. But a strong negative/corrective reaction was the right thing here :)

mradermacher changed discussion status to closed

Nice they are now actually uploading the real model: https://huggingface.co/huihui-ai/DeepSeek-V3-abliterated/commits/main
I also gave them the 100th like of this model because now they actually delivered, they deserve it. Their upload speed is quite fast so we can expect them to be done in a few hours. I will be really excited to quantize and try it out. Original V1/R1 was Chinese and r1-1776 American propaganda so hopefully that one finally answers my question without refusing to answer them or give me some propaganda due to politic or ethical reasons. In case anyone wonders: I did not forget about r1-1776 and we will probably quantize it as soon https://github.com/ggml-org/llama.cpp/pull/12725 is merged. I actually really don't like that we continue doing R1 based models like DeepSeek-V3-abliterated and DeepSeek-V3-0324-Pruned-Coder-411B while this is not merged but to be fair llama.cpp developers are working painfully slow on implementing this so we just had to move on and doing most of them. The missing MLA puts us in a very difficult position as basically all quants we do before this is merged will not be MLA compatible and given how massive MLA is this almost justifies a requantisation for all major R1 based models which due to their size is from a resource usage perspective almost not justifiable. Let's just hope llama.cpp doesn't compleately drop support for any old R1 model as was intitially planed in this PR. Especially with Qwen 3 and Llama 4 getting released soon there might be a busy time ahead of us. In any case I’m very relieved that we go the queue down as much we did.

I downloaded the model, but quantization of this and any other DeepSeek V2/V3 based models should in my opinion be halted until the discussion in https://github.com/ggml-org/llama.cpp/pull/12772 is concluded. It now again seems more likely that llama.cpp will drop support for all models not using MLA, which would make all our current DeepSeek V2/V3 quants useless, but to be fair, nobody would want to use them without MLA anyway, as without MLA llama.cpp runs them at what I consider unusable speed.

I now queued them like this to ensure they wait inside the queue until the MLA discussion concludes. Once it does, I will manually provide the source GGUFs, bump the priority, change the worker to nico1 and force push to nico1.

llmc add 7777 si https://huggingface.co/huihui-ai/DeepSeek-V3-abliterated worker +waiting-for-mla
llmc add 7777 si https://huggingface.co/huihui-ai/DeepSeek-V3-0324-Pruned-Coder-411B worker +waiting-for-mla
llmc add force 7777 si https://huggingface.co/perplexity-ai/r1-1776 worker +waiting-for-mla

There is also https://github.com/ggml-org/llama.cpp/pull/12727, which heavily impacts DeepSeek-V3 based models. So it's likely worth waiting for it to be merged as well.

I now queued them like this to ensure they wait inside the queue until the MLA discussion concludes.

Very good idea, that way, we can keep track of these more visibly (especially helpful when I am somewhat absent, like now).

So it's likely worth waiting for it to be merged as well.

Yeah, and hopefully it's not far from being merged, either.
