Triangle104/MN-Chunky-Lotus-12B-Q4_K_S-GGUF

This model was converted to GGUF format from FallenMerick/MN-Chunky-Lotus-12B using llama.cpp via the ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.


Model details:

I had originally planned to use this model for future/further merges, but decided to go ahead and release it since it scored rather high on my local EQ Bench testing (79.58 w/ 100% parsed @ 8-bit). Bear in mind that most models tend to score a bit higher on my own local tests as compared to their posted scores. Still, its the highest score I've personally seen from all the models I've tested. Its a decent model, with great emotional intelligence and acceptable adherence to various character personalities. It does a good job at roleplaying despite being a bit bland at times.

Overall, I like the way it writes, but it has a few formatting issues that show up from time to time, and it has an uncommon tendency to paste walls of character feelings/intentions at the end of some outputs without any prompting. This is something I hope to correct with future iterations.

This is a merge of pre-trained language models created using mergekit.

Merge Method

This model was merged using the TIES merge method.

Models Merged

The following models were included in the merge:

Epiculous/Violet_Twilight-v0.2
nbeerbower/mistral-nemo-gutenberg-12B-v4
flammenai/Mahou-1.5-mistral-nemo-12B

Configuration

The following YAML configuration was used to produce this model:

models:

  • model: Epiculous/Violet_Twilight-v0.2 parameters: weight: 1.0 density: 1.0
  • model: nbeerbower/mistral-nemo-gutenberg-12B-v4 parameters: weight: 1.0 density: 0.54
  • model: flammenai/Mahou-1.5-mistral-nemo-12B parameters: weight: 1.0 density: 0.26 merge_method: ties base_model: TheDrummer/Rocinante-12B-v1.1 parameters: normalize: true dtype: bfloat16

The idea behind this recipe was to take the long-form writing capabilities of Gutenberg, curtail it a bit with the very short output formatting of Mahou, and use Violet Twilight as an extremely solid roleplaying foundation underneath. Rocinante is used as the base model in this merge in order to really target the delta weights from Gutenberg, since those seemed to have the highest impact on the resulting EQ of the model.

Special shoutout to @matchaaaaa for helping with testing, and for all the great model recommendations. Also, for just being an all around great person who's really inspired and motivated me to continue merging and working on models.


Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

brew install llama.cpp

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q4_K_S-GGUF --hf-file mn-chunky-lotus-12b-q4_k_s.gguf -p "The meaning to life and the universe is"

Server:

llama-server --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q4_K_S-GGUF --hf-file mn-chunky-lotus-12b-q4_k_s.gguf -c 2048

Note: You can also use this checkpoint directly through the usage steps listed in the Llama.cpp repo as well.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with LLAMA_CURL=1 flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make

Step 3: Run inference through the main binary.

./llama-cli --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q4_K_S-GGUF --hf-file mn-chunky-lotus-12b-q4_k_s.gguf -p "The meaning to life and the universe is"

or

./llama-server --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q4_K_S-GGUF --hf-file mn-chunky-lotus-12b-q4_k_s.gguf -c 2048
Downloads last month
4
GGUF
Model size
12.2B params
Architecture
llama

4-bit

Inference API
Unable to determine this model’s pipeline type. Check the docs .

Model tree for Triangle104/MN-Chunky-Lotus-12B-Q4_K_S-GGUF

Quantized
(10)
this model