[Feedback]

#1
by Darkknight535 - opened

[Feedback here]

Darkknight535 pinned discussion

Thanks, very good model. If you make a v2 version, I will download it :)

@Ttimofeyka Glad you like it.

Really like this model. I used it for a simple test and it turned out great: it's creative, but it sometimes gets horny, as you already said. Made this account just to comment.

Yes, and here are some possible directions for you:

  • Try using a 15-18B model (if one exists). I know that parameter count is a controversial thing in the case of RP, but why not take advantage of it, heh.
  • In SillyTavern with "Include Names" enabled, your model can work, but not well enough. To be honest, I don't see any special solution in this case. It might be worth trying to improve the model so that it can be used without "Include Names" (the current model, as you did say in the README, is unstable in this regard).

@Ttimofeyka ["Include Names"] Yeah, I've checked it and tested two things about it: if a character outside of the roleplay speaks and that chat remains in the context, the model does speak as any other character in future roleplay. A workaround would be to use example messages, which would be the more stable way, but I'll still try. The main way I could achieve this was with the base Llama 3 8B Instruct model.

I have really enjoyed this model! I've been using it with DRY and XTC, and it outperforms any Nemo-based model for RP, in my opinion.

It leans a bit into ERP when it doesn't really suit the mood of the conversation, but with a few swipes, I find a response that's tamer.

I've tried the 15B version, but I honestly prefer this version because it's faster and because it seems to handle higher contexts better than the 15B model.
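For anyone who wants to try the same samplers, here is a minimal sketch of DRY + XTC settings. The values are commonly cited starting points, not the exact settings used above, and the exact field names vary between backends (KoboldCpp, llama.cpp, and text-generation-webui each spell them slightly differently), so treat this as illustrative only:

```python
# Illustrative DRY + XTC sampler settings (starting points only; tune per
# model, and check your backend's docs for the exact parameter names).
sampler_settings = {
    "temperature": 1.0,
    # DRY: penalizes verbatim repetition of earlier sequences.
    "dry_multiplier": 0.8,    # overall penalty strength (0 disables DRY)
    "dry_base": 1.75,         # exponential growth of the penalty with length
    "dry_allowed_length": 2,  # repeats up to this length are not penalized
    # XTC: randomly excludes the most probable tokens to boost creativity.
    "xtc_threshold": 0.1,     # tokens above this probability are candidates
    "xtc_probability": 0.5,   # chance per step that the exclusion is applied
}

if __name__ == "__main__":
    for name, value in sampler_settings.items():
        print(f"{name} = {value}")
```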

Did you find any differences in the logic and reasoning of the model between v1 and v2?

I don't know about logic, but it seems like 12B v1 is better in the humanity of its answers. This is achieved by cutting off the first messages (the model generates many paragraphs) and by a temperature of 1.2.

lol, that's why I use a max token limit of 100 and trim incomplete sentences.

A limit of 100 tokens may result in the deletion of important text. For me, it's more efficient to just crop the text manually.

From what I observed, if your first message is short, it generally follows that pattern but still attempts to generate more content. I set the maximum token length to 100-120, which is about 3-4 sentences. I prefer short responses, unlike Character AI, where the model's responses vary dynamically from long to very short.
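As an aside, the "trim incomplete sentences" behavior mentioned above is easy to reproduce outside SillyTavern. A minimal sketch in Python (the helper name is made up; SillyTavern has its own built-in option for this):

```python
import re

def trim_incomplete_sentence(text: str) -> str:
    """Drop any trailing fragment after the last complete sentence.

    Hypothetical helper mirroring SillyTavern's "Trim Incomplete Sentences"
    option: with a hard cap of ~100-120 tokens, generation often stops
    mid-sentence, so everything after the final '.', '!', or '?' (optionally
    followed by a closing quote) is removed.
    """
    matches = list(re.finditer(r'[.!?]["\u201d\u2019]?', text))
    if not matches:
        return text  # no sentence terminator at all; keep the text as-is
    return text[: matches[-1].end()]

# A 100-token cutoff often lands mid-sentence:
raw = 'She nodded slowly. "Fine," she said. Then the door swung op'
print(trim_incomplete_sentence(raw))  # -> She nodded slowly. "Fine," she said.
```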

I'm waiting for someone to create an EXL2 quantization. So far, only GGUF quants have been made for this model, lol.

I see. Anyway, it's important to understand the difference between the 12B and the 15B v2.1 in order to improve the 15B one. The speed difference is nothing, but the creativity and humanity of the answers are a real problem, I think.

so what do you suggest? v1? v2? v2.1? (I wanna improve the best one)

I prefer this model, but I want to see a model with this creativity and less perplexity.

okay..

I think the perplexity can be reduced by increasing the model's parameters, but at the same time, the current 15B is simply not worth the extra billions of parameters. Maybe you just made the merge with the wrong base models or something. Also, although you indicated that you use automatic cleanup of excessively long messages, a good model should write in moderation: not 5-6 paragraphs, but 1 to 4 depending on the context (preferring 2-3 as a neutral maximum). Maybe more, but only when talking to other NPCs or during big/unique events.
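(A rough post-processing sketch of that length rule, assuming paragraphs are separated by blank lines; the cap of 3 just reflects the "2-3 as a neutral maximum" above:)

```python
def cap_paragraphs(text: str, max_paragraphs: int = 3) -> str:
    """Keep at most `max_paragraphs` paragraphs of a model response.

    Illustrative only: paragraphs are assumed to be separated by blank
    lines, and the default cap of 3 follows the "2-3 as a neutral
    maximum" guideline above.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return "\n\n".join(paragraphs[:max_paragraphs])

reply = "First paragraph.\n\nSecond.\n\nThird.\n\nFourth rambles on too long."
print(cap_paragraphs(reply))  # drops everything after the third paragraph
```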

Also, unlike MN Celeste 12B, this model is more suitable for long-term chatting. Celeste (at least with my settings) responded in character, but it absolutely did not develop events further.

So, if you are interested in my opinion, the ideal model would have a slightly larger number of parameters than your model, but that increase should pay off (at least in context size and conversation quality). So far, I'm using this model as my main one because, for all its flaws, it's the best thing in this range of models.

P.S. I recently started using your model with a 0.75 RoPE scaling, and I did not notice any deterioration in quality when increasing the context to 12k. Perhaps it will be useful for others.
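For anyone who wants to reproduce that setup, a minimal sketch with llama-cpp-python (the GGUF filename is a placeholder; 12288 tokens is the 12k context mentioned above, and the generation values echo the max-token and temperature settings discussed earlier in this thread):

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# "model.Q5_K_M.gguf" is a placeholder; point this at your local quant.
llm = Llama(
    model_path="model.Q5_K_M.gguf",
    n_ctx=12288,           # 12k context window
    rope_freq_scale=0.75,  # the 0.75 RoPE scaling mentioned above
)

out = llm("Hello!", max_tokens=120, temperature=1.2)
print(out["choices"][0]["text"])
```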

  • It can go up to 16k, I guess (not tested); I still use it at 8k, and it's great, the creativity is fire.
  • The issue is that I've tried increasing the parameters, as you know, but it made the model dumber and plainer.
  • According to my testing, it's good.

Maybe then you can look towards MoE? It's a cheaper way to increase model parameters, with less risk of degrading the result, and the performance will be higher than that of conventional dense models.

my GPU is not that good.

but so far I'll say the model is good. I'll try to increase its parameters and do some tweaks again.. maybe wait 2-3 days for v3

@Ttimofeyka do you have Discord?

my ID: rgojosatoru

Yeah. ID: ttimofeyka
