what template are we using?
FROM ../meta-llama-3.1-8b-instruct-abliterated.Q8_0.gguf
TEMPLATE """
{{if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}{{ end }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|start_header_id|>assistant<|end_header_id|>
{{ .Response }}<|eot_id|>
"""
PARAMETER num_ctx 20000
PARAMETER num_predict 2000
PARAMETER num_gpu -1
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
with this Modelfile I'm getting repetitive, run-on output...
anyone using a different template than this?
https://ollama.com/mannix/llama3.1-8b-abliterated
I'm using two different templates, one with tools and one without
Didn't test this specific scenario, though
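For comparison, the run-on with the original Modelfile is probably the template itself: it never emits <|eot_id|> after the system and user turns, so the model never sees a turn boundary and just keeps going. A minimal sketch of a non-tools TEMPLATE, modeled on Ollama's stock llama3/llama3.1 template (untested against this particular quant, so treat it as a starting point):

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""

The tools variant layers tool-definition and tool-call handling on top of this; the turn structure stays the same.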
great! that's much better than mine... unfortunately I'm still not getting good results from this model. I'm having much better success with Llama 3.1 8B Instruct itself...
I shouldn't be overfilling the context at 20000, should I? Very strange to me; I'll tinker and report back
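If the fixed template alone doesn't cure it, the sampling parameters are another thing to tinker with; these values are guesses to start from, not known-good settings for this model:

# nudge the sampler away from loops; tune from here
PARAMETER repeat_penalty 1.1
PARAMETER repeat_last_n 256
PARAMETER temperature 0.7

And num_ctx 20000 shouldn't be overfilling anything: Llama 3.1 was trained with up to 128K context, so the practical limit there is RAM/VRAM, not the model.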
Yeah, I'm having the same issues with openchat. I'm attempting to design my own interface, but I haven't implemented a template yet; I only heard of them recently. I really do need to, because as it stands, almost all my interface outputs are massive max-token info-dumps, with a big lack of prompt coherence on top. And it takes 3-5 minutes to get a single response, even for simple queries.
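A missing template would explain the max-token info-dumps with openchat too: if the model's end-of-turn token never appears in the prompt (or as a stop sequence), nothing ever tells it to stop. A rough Modelfile sketch using OpenChat 3.5's published prompt format (the "GPT4 Correct" role prefixes and <|end_of_turn|> come from the model card; how the system message is spliced in is my assumption, and num_predict 512 is just a guess to rein in the dumps):

# hypothetical base; point FROM at whatever openchat weights you're running
FROM openchat
TEMPLATE """{{ if .System }}{{ .System }}<|end_of_turn|>{{ end }}GPT4 Correct User: {{ .Prompt }}<|end_of_turn|>GPT4 Correct Assistant: {{ .Response }}"""
# stop at end-of-turn so replies don't run to the token limit
PARAMETER stop "<|end_of_turn|>"
PARAMETER num_predict 512

Capping num_predict should also help with the 3-5 minute responses, since most of that time is likely spent generating the dump itself.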