Probably The Best Multishot Model Series So Far
It seems to understand the context far better than any other model I've tried, even several 33B models. Thanks for the conversion.
Great to hear. Are you finding that it won't stop and keeps answering itself? I found that I had to configure a stopping string in the UI, as it doesn't seem to have a stopping token implemented.
You mean, it talks too much? 😂
Hey man, can you please share the code for how you loaded the model in a Jupyter notebook? I'm still learning this...
Thank you
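Not the original poster, but here's a minimal sketch, assuming the conversion is a standard Hugging Face transformers checkpoint, of loading it in a Jupyter cell. The repo ID and the prompt format below are placeholders; substitute the actual ones from this model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo ID -- replace with the actual model repo from this card.
model_id = "your-username/vicuna-13b-converted"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 13B model fits on one GPU
    device_map="auto",          # requires the `accelerate` package
)

# Assumed Vicuna-style prompt format; adjust to whatever the card recommends.
prompt = "### Human: Explain what an in-memory database is.\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```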
Yeah, I've been using stopping strings by default since the original Vicuna, so I didn't notice the missing stop token.
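For anyone doing this outside the UI, here's a minimal sketch of the same stopping-string trick using a transformers StoppingCriteria. The "### Human:" marker is an assumption about the prompt format; use whichever string the model emits when it starts answering itself.

```python
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnString(StoppingCriteria):
    """Stop generation once a given string appears in the newly generated text."""
    def __init__(self, tokenizer, stop_string, prompt_len):
        self.tokenizer = tokenizer
        self.stop_string = stop_string
        self.prompt_len = prompt_len  # number of prompt tokens to skip when checking

    def __call__(self, input_ids, scores, **kwargs):
        generated = self.tokenizer.decode(input_ids[0][self.prompt_len:])
        return self.stop_string in generated

# Usage, with tokenizer/model/inputs set up as in the loading sketch above.
# "### Human:" is an assumed turn marker -- replace it with the string the model
# actually produces when it keeps talking to itself.
stop = StoppingCriteriaList(
    [StopOnString(tokenizer, "### Human:", inputs["input_ids"].shape[1])]
)
output = model.generate(**inputs, max_new_tokens=256, stopping_criteria=stop)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The custom stopping string setting in the web UI does essentially the same thing, just without any code.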
Yes, the model is good, but it hallucinates and loses context too often at scale.
I haven't seen this happen at all with other 13B models.
What I mean is: the model starts writing about a "Golang in-memory database" and then switches to "machine learning" as if it had been talking about ML all along.
One more thing: all the llama-based models I've tried weren't really aware of Google as a search engine and preferred Bing. This one is fine with Google.
Same gripe here: it seems to have an attractor that pulls the context toward ML. I asked it to write interferometry code and it switched to MNIST halfway through. Great speed and a solid first couple thousand tokens, though.