big thanks to lore for the 8xH100 gpus

gptq

4 bits, 0.1 damp, 128 group size, true sequential
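
if u want to write these settings down somewhere, here is a minimal sketch. the four values are the ones listed above; the key names are an assumption, modeled on the quantize_config.json that auto-gptq saves next to a model, and `dump_quantize_config` is just a hypothetical helper for illustration.

```python
import json

# values come from the line above; key names are an assumption modeled on
# the quantize_config.json written by auto-gptq
quantize_config = {
    "bits": 4,                # 4-bit weights
    "damp_percent": 0.1,      # the "0.1 damp" setting
    "group_size": 128,        # one scale/zero point per 128 weights
    "true_sequential": True,  # quantize layers inside a block one after another
}


def dump_quantize_config(config: dict) -> str:
    """Serialize the settings as pretty-printed JSON with a stable key order."""
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(dump_quantize_config(quantize_config))
```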

training

base model is meta llama 3 8b instruct. i trained it on pippa first, then trained the resulting model on limarp, both at 32k context for 2 epochs each

gen settings

i would start with every sampler off, temperature at 1, and min p at 0.05. i got good outputs from this, but u can also try the gen settings from shori, which are copy pasted below

  • Main choice (may have repetition issues)
    • Temperature: 1.0; Min-P: 0.05-0.10; Presence Penalty: 0.35-0.45
  • Alternative 1 (appears to solve repetition issues while staying coherent, but responses might be less truthful)
    • Temperature: 2.40-2.50; Min-P: 0.40; Frequency penalty: 0.10-0.15; Temperature last.
  • Alternative 2
    • Mirostat type: 2, Mirostat Tau: 2.80-3.00; Mirostat Eta: 0.0175-0.0200; neutralize or disable all other samplers
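
to make the min p and "temperature last" settings above a bit more concrete, here is a rough pure python sketch of how a min p filter is commonly described: keep every token whose probability is at least min p times the top token's probability. the function names are hypothetical, and real backends may differ in the details, so treat this as an illustration and not as how any particular frontend does it.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(logits, min_p=0.05, temperature=1.0, temperature_last=False):
    """Return {token_index: probability} for the tokens that survive min p.

    A token survives if its probability is at least min_p times the
    probability of the most likely token.
    """
    # with temperature last, the cutoff is decided on the untouched distribution
    select_temp = 1.0 if temperature_last else temperature
    probs = softmax(logits, select_temp)
    cutoff = min_p * max(probs)
    kept = [i for i, p in enumerate(probs) if p >= cutoff]
    # either way, the survivors are renormalized with temperature applied
    final = softmax([logits[i] for i in kept], temperature)
    return dict(zip(kept, final))


if __name__ == "__main__":
    toy_logits = [5.0, 4.0, 1.0, -2.0]  # made-up numbers, not from the model
    print(min_p_filter(toy_logits, min_p=0.05, temperature=1.0))
    print(min_p_filter(toy_logits, min_p=0.40, temperature=2.5, temperature_last=True))
```

with a high temperature applied first, the distribution is flattened before the cutoff, so more tokens get through; with temperature last, the cutoff sees the original distribution, which is why a high temperature like 2.4 can stay coherent next to a strict min p.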

prompting

use the llama 3 instruct format

<|eot_id|> as stopping sequence/string/token
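
if ur frontend does not have a llama 3 preset, a minimal sketch of building the prompt by hand and cutting the output at the stop string could look like this. the layout (header, blank line, text, <|eot_id|>) follows meta's published llama 3 instruct format as far as i know; `build_prompt` and `cut_at_stop` are hypothetical helper names. the agnaistic template further down uses `response` as the bot header, so `reply_role` is left as a parameter.

```python
EOT = "<|eot_id|>"


def header(role: str) -> str:
    """Role header followed by the blank line used in the llama 3 layout."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n"


def build_prompt(system: str, turns: list[tuple[str, str]],
                 reply_role: str = "assistant") -> str:
    """Join a system message and (role, text) turns, then open the reply turn."""
    parts = ["<|begin_of_text|>", header("system"), system, EOT]
    for role, text in turns:
        parts += [header(role), text, EOT]
    parts.append(header(reply_role))  # left open so the model writes the reply
    return "".join(parts)


def cut_at_stop(generated: str, stop: str = EOT) -> str:
    """Drop the stop string and anything generated after it."""
    return generated.split(stop, 1)[0]


if __name__ == "__main__":
    prompt = build_prompt("You are a roleplay partner.", [("user", "hi there")])
    print(prompt)
    print(cut_at_stop("hello!" + EOT + "leftover text"))
```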

ST jsons: instruct, context

agnaistic prompt:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{#if system}}<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{system}}<|eot_id|>{{/if}}Write {{char}}'s next reply in a fictional roleplay chat between {{#each bot}}{{.name}}, {{/each}}{{char}} and {{user}}.

{{char}}'s Persona: {{personality}}

{{#if memory}}
Important details:
{{memory}}
{{/if}}

{{#if example_dialogue}}This is how {{char}} should talk:
{{example_dialogue}}{{/if}}

This scenario of the conversation: {{scenario}}

Then the roleplay chat between {{#each bot}}{{.name}}, {{/each}}{{char}} and {{user}} begins.<|eot_id|>

{{#each msg}}{{#if .isbot}}<|start_header_id|>response<|end_header_id|>{{/if}}{{#if .isuser}}<|start_header_id|>user<|end_header_id|>{{/if}}{{.name}}: {{.msg}}<|eot_id|>
{{/each}}
{{#if ujb}}<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{ujb}}<|eot_id|>{{/if}}
<|start_header_id|>response<|end_header_id|>{{post}}