[Suggestion] Proper quantization labelling, attribution

#1
by finis-est - opened

Hey, thanks for coming up with EXL2 quants for Snowdrop!

It would be cool to see this listed properly under Snowdrop's quantizations, similar to how the repos below do it. It would also be useful for people looking at options from the Snowdrop model page:

(screenshot: the quantizations list on the Snowdrop model page)

Just a suggestion; here's how the other quanters do it in their model cards. I think changing the base model to Snowdrop might be enough:
https://huggingface.co/mradermacher/QwQ-Snowdrop-GGUF/edit/main/README.md
https://huggingface.co/mradermacher/QwQ-Snowdrop-i1-GGUF/edit/main/README.md
https://huggingface.co/janboe91/QwQ-32B-Snowdrop-v0-4bit/edit/main/README.md
https://huggingface.co/DevQuasar/trashpanda-org.QwQ-32B-Snowdrop-v0-GGUF/edit/main/README.md

Again, thanks for the quant!

Ready.Art org

I added Snowdrop to https://huggingface.co/ReadyArt/QwQ-32B-Snowdrop-v0_EXL2_4.0bpw_H8 under `base_model` for the 4.0bpw quant, which I think is what you wanted?

Can you confirm that's what you're asking for? If so, I'll go ahead and apply the changes and slowly (probably over the next week or two) do the same for our other 300+ models.

It's not my intention to take credit for the models; I just don't have an automated process for changing the model cards, and 300+ models are a lot to keep track of. Usually I just copy the repository, quant it, and upload the quants.

I'll look into changing how I do things, maybe implementing my own small model card instead of keeping the original.

Hmm, checked Snowdrop and the quant was still getting listed as a merge, wonder what's up with that...

No worries, figured it was something automated and it's definitely a hassle to do everything manually. Thanks for looking into it, and for your efforts in quanting, appreciate it.

Ready.Art org

> Hmm, checked Snowdrop and the quant was still getting listed as a merge, wonder what's up with that...
>
> No worries, figured it was something automated and it's definitely a hassle to do everything manually. Thanks for looking into it, and for your efforts in quanting, appreciate it.

I think quanters like Bartowski automate the entire thing. I'm not that fancy, but I'll definitely look into changing how I do things and updating the existing models once I do so. It's probably a good idea anyway, as the existing model cards have media (i.e. images) from their creators. Thank you for bringing this up.

Cool, no worries, if anything I'm grateful you heard us out. Thanks and looking forward to it, no rush though!

finis-est changed discussion status to closed

I think to get it listed properly you only need `base_model: trashpanda-org/QwQ-32B-Snowdrop-v0` for the models.
However, since Hugging Face doesn't directly support EXL2 configs, you also need to add `base_model_relation: quantized` so the repo is treated as a quantized model.
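Putting that together, the YAML frontmatter at the top of the quant repo's README.md would look something like this (the base repo name is the one from this thread; the rest of the model card follows below the closing `---`):

```yaml
---
base_model: trashpanda-org/QwQ-32B-Snowdrop-v0
base_model_relation: quantized
---
```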

Ready.Art org

> I think to get it listed properly you only need `base_model: trashpanda-org/QwQ-32B-Snowdrop-v0` for the models.
> However, since Hugging Face doesn't directly support EXL2 configs, you also need to add `base_model_relation: quantized` so the repo is treated as a quantized model.

This does appear to be the case. I'll figure out what I'm going to do with the 300+ models we already have... going to be fun.
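For the 300+ existing repos, one rough sketch of a batch update (the repo mapping and the dry-run loop are illustrative, not ReadyArt's actual process; only the Snowdrop entry comes from this thread) could look like:

```python
# Hypothetical sketch: batch-tag quant repos so they list under their base models.

def quant_metadata(base_model: str) -> dict:
    """Frontmatter fields that make Hugging Face treat a repo as a quantization."""
    return {
        "base_model": base_model,
        "base_model_relation": "quantized",
    }

# Map each quant repo to the model it was quantized from.
# Only the Snowdrop entry is from this thread; a real run would list all 300+.
repos = {
    "ReadyArt/QwQ-32B-Snowdrop-v0_EXL2_4.0bpw_H8": "trashpanda-org/QwQ-32B-Snowdrop-v0",
}

for repo, base in repos.items():
    meta = quant_metadata(base)
    print(f"{repo}: {meta}")  # dry run: just show what would be written
    # To actually apply it, huggingface_hub's metadata_update could be used:
    # from huggingface_hub import metadata_update
    # metadata_update(repo, meta, overwrite=True)
```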

Ready.Art org

Can you confirm that's what you're looking for? I edited the model card. I think it's correct.

I'm going to flesh out a custom model card for my quants over the weekend and then start (slowly) replacing all 300+ of them... going to be a blast.

FrenzyBiscuit changed discussion status to open

We'd have to wait until @finis-est sees this; I'm not part of their org, I just wanted to help from the exllama community.
Also, not sure what your workflow is, but for future reference I'd also recommend adding the measurement.txt file to the repo if you can. It lets anyone skip the measurement pass by reusing yours: for example, if I wanted to make my own 3.5bpw quant, I could use it and skip the whole first half of the quanting process. It's not necessary, but it would be helpful for a few people.
Thanks for your quants though, they've been helpful for trying out new finetunes quickly.

@FrenzyBiscuit Looking good now, can see it getting listed as a quant correctly. Thanks again!

Ready.Art org

> @FrenzyBiscuit Looking good now, can see it getting listed as a quant correctly. Thanks again!

No, thank you for calling me out on it. I was being lazy.

Also, had no idea about the quant thing under the model card, so again thank you :)

FrenzyBiscuit changed discussion status to closed