Praise and Criticism

#23

by ChuckMcSneed - opened Aug 15

Aug 15

Praise

You've cooked really hard with this one. While Mixtral 8x22b felt a bit like a sidegrade to Miqu(Mistral-Medium), this one is definitely an upgrade.

Intelligence

This model feels really intelligent, able to pick up the most subtle details. Makes it worth it over Command-r-plus. Really great at coding too.

No positivity bias

While Miqu was overly positive and never dared to do or say anything bad, even if it was in character, this one can, which allows the stories to be more immersive.

Not castrated like LLAMA

LLAMA 3+ was filtered to hell in the base, which made it outright unusable to me and many others. It's okay as an assistant, but no more beyond that. Largestral btfos even 405b due to this, in my opinion.

Criticism

Now, the flaws.

GPTisms

Lots of overused phrases from GPT4 training data, not as bad as with some other models, but still bad. Do you know why people liked Command-r-plus? Because it had none of those, it felt different. Humans really hate GPTisms, they are like a big, fat sign saying "look, I'm ChatGPT!", feels really soulless. Nobody likes that shiver slop.

Repetition

A minor flaw solved with samplers, but it feels like the model picks up patterns a bit too quickly? Not a very big problem, just an observation.

Overfitting

That's one of the features of your past models. I don't know why you do it, but I had to pull up the temperature to 3 to break overfitted sentences, which I shouldn't be doing.
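For readers unfamiliar with why raising the temperature breaks up stock sentences, here is a minimal sketch of temperature-scaled softmax. It is not taken from any particular inference library; the function name and the logits are hypothetical and only illustrate how dividing logits by a larger temperature flattens the next-token distribution, so an over-favored token loses some of its lead.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw logits to probabilities after dividing by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits: the first token is heavily favored, standing in
# for the next word of an "overfitted" stock phrase.
logits = [5.0, 2.0, 1.0, 0.5]

probs_t1 = softmax_with_temperature(logits, temperature=1.0)
probs_t3 = softmax_with_temperature(logits, temperature=3.0)

print("T=1:", [round(p, 3) for p in probs_t1])
print("T=3:", [round(p, 3) for p in probs_t3])
```

At temperature 3 the favored token still leads but by a much smaller margin, which is why alternatives get sampled more often. The cost is that unlikely tokens also become more probable, so coherence can suffer.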

pandora-s

Mistral AI_ org Aug 16

@ChuckMcSneed Hi!

Thank you so much for all the feedback!

Could you share a bit more on the overfitted aspect? Which sentences did you find overfitted?

ChuckMcSneed

Aug 16

@pandora-s Mostly phrases which are overused by OpenAI's GPT models. Here are some:

a mix of X and Y
Ah,
tapestry
mischievous
smirk
chuckle
husky voice
barely above a whisper
couldn't help but
shivers
maybe, just maybe
a testament to
cold and calculating
growl
lean in
LOTS of phrases with eyes
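One rough way to quantify phrases like the ones listed above is to count them in generated text. The sketch below is only an illustration: the `PHRASES` subset, the `count_phrases` helper, and the sample sentence are all hypothetical, and simple case-insensitive matching will miss inflected forms or patterns such as "a mix of X and Y".

```python
import re
from collections import Counter

# A hypothetical subset of the phrases from the list above.
PHRASES = [
    "tapestry",
    "mischievous",
    "barely above a whisper",
    "couldn't help but",
    "a testament to",
    "maybe, just maybe",
]


def count_phrases(text, phrases=PHRASES):
    """Count case-insensitive, whole-word occurrences of each phrase."""
    lowered = text.lower()
    counts = Counter()
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[phrase] = hits
    return counts


# Made-up sample text for illustration only.
sample = (
    "Her voice was barely above a whisper, a testament to the tapestry "
    "of secrets. A rich tapestry indeed."
)

for phrase, hits in count_phrases(sample).most_common():
    print(f"{hits}x {phrase}")
```

Running this over a batch of outputs from the same prompt gives a crude per-phrase frequency that can be compared across models or sampler settings.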

pandora-s

Mistral AI_ org Aug 16

I see, could you also share some feedback related to those "gptisms"? 🤔

gghfez

Aug 16

edited Aug 16

"bustling" has now become my least favorite word

That being said, this is my favorite open weights model now.

ChuckMcSneed

Aug 16

Here are some more, as you can see, people hate them.

pandora-s

Mistral AI_ org Aug 16

Thanks a lot for all the feedback! If you run into any other issues, feel free to share. We are always open to feedback to improve our models!

MB7977

Aug 16

Just to add some more to the praise side of things here, I really appreciate that this model just zeroes in on what you tell it to do. No fluff, no introducing things when you don't ask it to, and not much in the way of convoluted wrapping up sentences either. At the same time, if you ask it to write at length, it will do so, and in a focused way. This flexibility is a real achievement. I have a bunch of hard, long context tasks and this is the first open weights model to nail them all. It's the best locally runnable model, IMO. Llama 3 and 3.1 might edge it in some benchmarks but in actual, everyday use, it's top dog.

BigHuggyD

Aug 16

...ditto... this model has replaced Cohere CR+ as my daily driver. More precise than CR+ in the early context depth, but it does fall off quicker than CR+ after 32k+ (my own impressions verified by RULER). Follows instructions better than CR+.
I agree with @ChuckMcSneed that some sampler judo is needed to squash some of the less desirable aspects, but it at least has the capability to do so. CR+, you could neutralize all samplers but temp at 1 and get something fresh and coherent over and over with the same opening prompt.
All in all well done and the best open-weight model on the block.

chrisilang

19 days ago

Trying to use this model as a major content writer, but currently, as already mentioned, it uses too many overused AI words.

more sources

thank you
