---
tags:
  - merge
  - mergekit
  - lazymergekit
---

# NemoDori-v0.1-12B-MS

NemoDori-v0.1-12B-MS is a Model Stock merge of the following models, made with LazyMergekit (see the merge configuration below). All credits to the original model authors.

This is my 'first' merge model, made just for testing purposes. I don't know what I'm doing, honestly...

My experience using this in SillyTavern:

- It advances the story slowly, responding to the last message quite nicely.
- Creativity is good; it sometimes surprised me with a response close to the one I was hoping for.
- It may skip time when the last message includes a word that resembles a promise (or literally mentions time).
- It sometimes responds with a long message, but it seems to adapt to the overall length of the roleplay so far, I think...

## Prompt and Preset

ChatML works best so far. Llama 3 and Mistral prompts also work, but the model sometimes speaks for you. (ChatML may speak for you too, but not that often; just re-generate.)
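For reference, ChatML wraps each turn in `<|im_start|>` / `<|im_end|>` markers. A minimal sketch of the layout (the system prompt text here is just an example; SillyTavern's ChatML preset produces this shape for you):

```python
# Standard ChatML layout, shown as a raw prompt string.
chatml_prompt = (
    "<|im_start|>system\n"
    "You are a creative roleplay partner.<|im_end|>\n"
    "<|im_start|>user\n"
    "Hello there!<|im_end|>\n"
    "<|im_start|>assistant\n"
)
```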

I use the context and instruct templates from here (credits to Virt-io).

This is the preset I use for SillyTavern; it should be good enough. Tweak to your heart's content:

- temp can go higher (I stopped at 2),
- skip special tokens may or may not be needed. If the model ends its response with "assistant" or "user", disable the checkbox; that should fix it. (I did get this in my first couple of tries, but now, no more. I dunno, man...) See the sketch after this list.
- context length: still coherent at 28k tokens, from my own testing.
- everything else is... just fine, as long as you're not forcing it.
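If you call the model through transformers instead of SillyTavern, the rough equivalent of the "skip special tokens" checkbox is the `skip_special_tokens` flag at decode time. A minimal sketch (this assumes the ChatML markers are registered as special tokens in this tokenizer, which I haven't verified):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("RozGrov/NemoDori-v0.1-12B-MS")
ids = tokenizer("<|im_start|>user\nHi!<|im_end|>").input_ids

# skip_special_tokens=True drops registered special tokens on decode,
# which is what hides stray role markers like "assistant"/"user".
print(tokenizer.decode(ids, skip_special_tokens=False))
print(tokenizer.decode(ids, skip_special_tokens=True))
```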

## 🧩 Configuration

```yaml
models:
  - model: Sao10K/MN-12B-Lyra-v1
  - model: Fizzarolli/MN-12b-Rosier-v1
  - model: MarinaraSpaghetti/Nemomix-v4.0-12B
  - model: aetherwiing/MN-12B-Starcannon-v2
merge_method: model_stock
base_model: aetherwiing/MN-12B-Starcannon-v2
dtype: bfloat16
```
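To reproduce the merge, a minimal sketch using the mergekit CLI (assuming the config above is saved as `config.yaml`; the output directory name is just an example):

```bash
pip install mergekit
mergekit-yaml config.yaml ./NemoDori-v0.1-12B-MS --cuda
```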

## 💻 Usage

```python
!pip install -qU transformers accelerate
```

```python
from transformers import AutoTokenizer
import transformers
import torch

model = "RozGrov/NemoDori-v0.1-12B-MS"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Render the chat messages into a prompt using the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline, sharding the model across available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
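If you only want the generated reply without the prompt echoed back, `return_full_text=False` is a standard text-generation pipeline flag (a sketch; sampling values are the same as above):

```python
outputs = pipeline(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
    return_full_text=False,  # drop the prompt from the returned text
)
print(outputs[0]["generated_text"])
```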