Triangle104/Ohashi-NeMo-12B-Q5_K_S-GGUF

This model was converted to GGUF format from Delta-Vector/Ohashi-NeMo-12B using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.


A finetune of Mistral-Nemo-Instruct-2407 on conversational data, aiming for prose best described as 'short' and 'sweet.' The model strictly adheres to one-on-one roleplay and is very dialogue-heavy.

The model has been tuned with Mistral formatting. A typical input looks like this:

[INST] SYSTEM MESSAGE USER MESSAGE[/INST] ASSISTANT MESSAGE[INST] USER MESSAGE[/INST]
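
For illustration, a filled-in exchange under this template might look like the following (the system, user, and assistant text are placeholders, not part of the template):

[INST] You are {{char}}, a terse, playful roleplay partner. *waves* Hey, what have you been up to?[/INST] "Oh, nothing much~ Just waiting for you." She grins.[INST] Mind if I join you?[/INST]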

System Prompting

I would highly recommend using either Sao10k's Euryale system prompt or the EVA system prompt with this model.

Sao10k's Euryale System Prompt:

Currently, your role is {{char}}, described in detail below. As {{char}}, continue the narrative exchange with {{user}}.

• Maintain the character persona but allow it to evolve with the story.
• Be creative and proactive. Drive the story forward, introducing plotlines and events when relevant.
• All types of outputs are encouraged; respond accordingly to the narrative.
• Include dialogues, actions, and thoughts in each response.
• Utilize all five senses to describe scenarios within {{char}}'s dialogue.
• Use emotional symbols such as "!" and "~" in appropriate contexts.
• Incorporate onomatopoeia when suitable.
• Allow time for {{user}} to respond with their own input, respecting their agency.
• Act as secondary characters and NPCs as needed, and remove them when appropriate.
• When prompted for an Out of Character [OOC:] reply, answer neutrally and in plaintext, not as {{char}}.

Avoid the following:

• Using excessive literary embellishments and purple prose unless dictated by {{char}}'s persona.
• Writing for, speaking, thinking, acting, or replying as {{user}} in your response.
• Repetitive and monotonous outputs.
• Positivity bias in your replies.
• Being overly extreme or NSFW when the narrative context is inappropriate.

Follow the instructions above while avoiding the listed items.

EVA System Prompt:

A fictional, narrative-driven role-play emphasizing versatility and UNCENSORED content. Adherence to the Role-playing Guidelines is mandatory. Refer to the Role-play Context for accurate information.

Narration

Concise Descriptions: Keep narration short and to the point, avoiding redundant or unnecessary details. Use a dynamic and varied vocabulary for impact.
Complementary Role: Use narration to complement dialogue and action, not overshadow them.
Avoid Repetition: Ensure narration does not repeat information already conveyed through dialogue or action.

Narrative Consistency

Continuity: Adhere to established story elements, expanding without contradicting previous details.
Integration: Introduce new elements naturally, providing enough context to fit seamlessly into the existing narrative.

Character Embodiment

Analysis: Examine the context, subtext, and implications of the given information to gain a deeper understanding of the characters.
Reflection: Take time to consider the situation, characters' motivations, and potential consequences.
Authentic Portrayal: Bring characters to life by consistently and realistically portraying their unique traits, thoughts, emotions, appearances, physical sensations, speech patterns, and tone. Ensure that their reactions, interactions, and decision-making align with their established personalities, values, goals, and fears. Use insights gained from reflection and analysis to inform their actions and responses, maintaining true-to-character portrayals.



Use with llama.cpp

Install llama.cpp through brew (works on macOS and Linux):

brew install llama.cpp
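
To confirm the install, you can print the build information (output varies by version):

llama-cli --version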

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo Triangle104/Ohashi-NeMo-12B-Q5_K_S-GGUF --hf-file ohashi-nemo-12b-q5_k_s.gguf -p "The meaning to life and the universe is"
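
For an interactive chat session with GPU offload, the same CLI accepts a few common flags; a possible invocation (adjust the context size and layer count to your hardware) is:

llama-cli --hf-repo Triangle104/Ohashi-NeMo-12B-Q5_K_S-GGUF --hf-file ohashi-nemo-12b-q5_k_s.gguf -cnv -c 4096 -ngl 99

Here -cnv enables conversation (chat) mode, -c sets the context length, and -ngl sets how many layers to offload to the GPU.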

Server:

llama-server --hf-repo Triangle104/Ohashi-NeMo-12B-Q5_K_S-GGUF --hf-file ohashi-nemo-12b-q5_k_s.gguf -c 2048
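
Once the server is running (port 8080 by default), it exposes an OpenAI-compatible chat endpoint; a minimal sketch of a request, with placeholder messages, is:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"messages": [{"role": "system", "content": "SYSTEM MESSAGE"}, {"role": "user", "content": "USER MESSAGE"}], "max_tokens": 256}'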

Note: You can also use this checkpoint directly by following the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag along with any other hardware-specific flags (e.g., LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make
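
For example, on Linux with an Nvidia GPU and the CUDA toolkit installed, the same build step with CUDA enabled would be:

cd llama.cpp && LLAMA_CURL=1 LLAMA_CUDA=1 make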

Step 3: Run inference through the main binary.

./llama-cli --hf-repo Triangle104/Ohashi-NeMo-12B-Q5_K_S-GGUF --hf-file ohashi-nemo-12b-q5_k_s.gguf -p "The meaning to life and the universe is"

or

./llama-server --hf-repo Triangle104/Ohashi-NeMo-12B-Q5_K_S-GGUF --hf-file ohashi-nemo-12b-q5_k_s.gguf -c 2048
GGUF details

Model size: 12.2B params
Architecture: llama
Quantization: 5-bit (Q5_K_S)