[Feedback]

#1
by Darkknight535 - opened

[Feedback here]

Darkknight535 pinned discussion

Thanks, very good model. If you make a v2 version, I will download it :)

@Ttimofeyka Glad you like it.

Really like this model. I used it for a simple test and it turned out great: it's creative, but it sometimes gets horny, as you already said. Made this account just to comment.

Yes, and here are some possible directions for you:

  • Try using a 15-18B model (if one exists). I know that parameter count is a controversial thing in the case of RP, but why not take advantage of it, heh.
  • In SillyTavern with "Include Names" enabled, your model can work, but not well enough. To be honest, I don't see any special solution in this case. It might be worth trying to improve the model so that it can be used without "Include Names" (the current model, as you did say in the README, is unstable in this regard).

@Ttimofeyka ["Include Names"] Yeah, I've checked it and tested two things about it: if a character outside of the roleplay speaks and that chat remains in the context, the model does speak as any other character in future roleplay. A workaround would be to use example messages, which would be the more stable way, but I'll still try. The main way I could achieve this was with the base Llama 3 8B Instruct model.

I have really enjoyed this model! I've been using it with DRY and XTC, and it outperforms any Nemo-based model for RP, in my opinion.

It leans a bit into ERP when it doesn't really suit the mood of the conversation, but with a few swipes, I find a response that's tamer.

I've tried the 15B version, but I honestly prefer this version because it's faster and because it seems to handle higher contexts better than the 15B model.
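For anyone who wants to try the same samplers, here is a minimal sketch of DRY + XTC settings. The values are commonly cited starting points, not the exact settings used above, and the exact field names vary between backends (KoboldCpp, llama.cpp, and text-generation-webui each spell them slightly differently), so treat this as illustrative only:

```python
# Illustrative DRY + XTC sampler settings (starting points only; tune per
# model, and check your backend's docs for the exact parameter names).
sampler_settings = {
    "temperature": 1.0,
    # DRY: penalizes verbatim repetition of earlier sequences.
    "dry_multiplier": 0.8,    # overall penalty strength (0 disables DRY)
    "dry_base": 1.75,         # exponential growth of the penalty with length
    "dry_allowed_length": 2,  # repeats up to this length are not penalized
    # XTC: randomly excludes the most probable tokens to boost creativity.
    "xtc_threshold": 0.1,     # tokens above this probability are candidates
    "xtc_probability": 0.5,   # chance per step that the exclusion is applied
}

if __name__ == "__main__":
    for name, value in sampler_settings.items():
        print(f"{name} = {value}")
```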

Did you find any differences in the logic and reasoning of the model between v1 and v2?

I don't know about logic, but it seems like 12B v1 is better in the humanity of its answers. This is achieved by cutting off the first messages (the model generates many paragraphs) and by a temperature of 1.2.

lol, that's why I use a max token limit of 100 and trim incomplete sentences.

A limit of 100 tokens may result in the deletion of important text. For me, it's more efficient to just crop the text manually.

From what I observed, if your first message is short, it generally follows that pattern but still attempts to generate more content. I set the maximum token length to 100-120, which is about 3-4 sentences. I prefer short responses, unlike Character AI, where the model's responses vary dynamically from long to very short.
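As an aside, the "trim incomplete sentences" behavior mentioned above is easy to reproduce outside SillyTavern. A minimal sketch in Python (the helper name is made up; SillyTavern has its own built-in option for this):

```python
import re

def trim_incomplete_sentence(text: str) -> str:
    """Drop any trailing fragment after the last complete sentence.

    Hypothetical helper mirroring SillyTavern's "Trim Incomplete Sentences"
    option: with a hard cap of ~100-120 tokens, generation often stops
    mid-sentence, so everything after the final '.', '!', or '?' (optionally
    followed by a closing quote) is removed.
    """
    matches = list(re.finditer(r'[.!?]["\u201d\u2019]?', text))
    if not matches:
        return text  # no sentence terminator at all; keep the text as-is
    return text[: matches[-1].end()]

# A 100-token cutoff often lands mid-sentence:
raw = 'She nodded slowly. "Fine," she said. Then the door swung op'
print(trim_incomplete_sentence(raw))  # -> She nodded slowly. "Fine," she said.
```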

I'm waiting for someone to create an EXL2 quantization. So far, only GGUF quants have been made for this model, lol.

I see. Anyway, it's important to understand the difference between the 12B and the 15B v2.1 in order to improve the 15B one. The speed difference is nothing, but the creativity and humanity of the answers are a real problem, I think.

so what do you suggest? v1? v2? v2.1? (I wanna improve the best one)

I prefer this model, but I want to see a model with this creativity and less perplexity.

okay..

I think the perplexity can be reduced by increasing the model's parameters, but at the same time, the current 15B is simply not worth the extra billions of parameters. Maybe you just made the merge with the wrong base models or something. Also, although you indicated that you use automatic cleanup of excessively long messages, a good model should write in moderation: not 5-6 paragraphs, but 1 to 4 depending on the context (preferring 2-3 as a neutral maximum). Maybe more, but only when talking to other NPCs or during big/unique events.
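(A rough post-processing sketch of that length rule, assuming paragraphs are separated by blank lines; the cap of 3 just reflects the "2-3 as a neutral maximum" above:)

```python
def cap_paragraphs(text: str, max_paragraphs: int = 3) -> str:
    """Keep at most `max_paragraphs` paragraphs of a model response.

    Illustrative only: paragraphs are assumed to be separated by blank
    lines, and the default cap of 3 follows the "2-3 as a neutral
    maximum" guideline above.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return "\n\n".join(paragraphs[:max_paragraphs])

reply = "First paragraph.\n\nSecond.\n\nThird.\n\nFourth rambles on too long."
print(cap_paragraphs(reply))  # drops everything after the third paragraph
```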

Also, unlike MN Celeste 12B, this model is more suitable for long-term chatting. Celeste (at least with my settings) responded in character, but it absolutely did not develop events further.

So, if you are interested in my opinion, the ideal model would have a slightly larger number of parameters than your model, but that increase should pay off (at least in context size and conversation quality). So far, I'm using this model as my main one because, for all its flaws, it's the best thing in this range of models.

P.S. I recently started using your model with a 0.75 RoPE scaling, and I did not notice any deterioration in quality when increasing the context to 12k. Perhaps it will be useful for others.
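For anyone who wants to reproduce that setup, a minimal sketch with llama-cpp-python (the GGUF filename is a placeholder; 12288 tokens is the 12k context mentioned above, and the generation values echo the max-token and temperature settings discussed earlier in this thread):

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# "model.Q5_K_M.gguf" is a placeholder; point this at your local quant.
llm = Llama(
    model_path="model.Q5_K_M.gguf",
    n_ctx=12288,           # 12k context window
    rope_freq_scale=0.75,  # the 0.75 RoPE scaling mentioned above
)

out = llm("Hello!", max_tokens=120, temperature=1.2)
print(out["choices"][0]["text"])
```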

  • It can go up to 16k, I guess (not tested); I still use it at 8k, and it's great, the creativity is fire.
  • The issue is that I've tried increasing the parameters, as you know, but it made the model dumber and plainer.
  • According to my testing, it's good.

Maybe then you can look towards MoE? It's a cheaper way to increase model parameters, with less risk of degrading the result, and the performance will be higher than that of conventional dense models.

my GPU is not that good.

but so far I'll say the model is good. I'll try to increase its parameters and do some tweaks again.. maybe wait 2-3 days for v3

@Ttimofeyka do you have Discord?

my ID: rgojosatoru

Yeah. ID: ttimofeyka
