It keeps spamming "assistant" at the end of sentences and then makes long rambling replies to itself
#2
by
6346y9uey
- opened
Stopping string issue
I've seen "assistant" and repeat itself with llama3 quantized exl2 models with tabbyAPI.
Wait for downstream bug fixes for other UIs.
latest main results: main: build = 2709 (40f74e4d)
F16
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMa.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
<|begin_of_text|>
> repeat this: I live.
I live.<|eot_id|>
> again.
I live.<|eot_id|>
>
IQ4_XS
<|begin_of_text|>
> repeat this once: I like soccer.
I like soccer.<|eot_id|>
>
from my experience with some 70b l3 gguf quants, such issues disappear when using the full format with user/assistant format, which meta states is important for good results:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>
{{ user_message }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
absence or lack of new lines is also important. double newlines are not necessary ime, but that's how meta specified it