How did you convert it?
#1 opened by ZeroWw
I tried to convert from the original model to get the f16 model, but I get an error:
python llama.cpp/convert-hf-to-gguf.py --outtype f16 /content/gemma-1.1-7b-it --outfile /content/gemma-1.1-7b-it.f16.gguf
INFO:hf-to-gguf:Loading model: gemma-1.1-7b-it
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
....
File "/content/llama.cpp/gguf-py/gguf/gguf_writer.py", line 166, in add_key_value
raise ValueError(f'Duplicated key name {key!r}')
ValueError: Duplicated key name 'tokenizer.chat_template'
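For reference, the failure is just two writes of the same GGUF key. A minimal sketch that reproduces it (my own illustration, assuming llama.cpp's gguf-py package is importable; the file path is arbitrary):

```python
# Hypothetical reproduction: writing 'tokenizer.chat_template' twice,
# which is what the converter ends up doing for Gemma.
from gguf import GGUFWriter

w = GGUFWriter("/tmp/repro.gguf", "gemma")
w.add_chat_template("{{ messages }}")  # first write is fine
w.add_chat_template("{{ messages }}")  # raises ValueError: Duplicated key name 'tokenizer.chat_template'
```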
I commented out the line in gguf_writer for now... we'll see.
Commenting out those instructions means the second duplicate simply overwrites the previous value.
def add_key_value(self, key: str, val: Any, vtype: GGUFValueType) -> None:
    # duplicate-key check disabled as a workaround; a repeated key
    # (here 'tokenizer.chat_template') now silently overwrites the old value
    #if key in self.kv_data:
    #    raise ValueError(f'Duplicated key name {key!r}')
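If you'd rather not delete the check entirely, a gentler local patch (my own sketch, assuming the current gguf-py where values are stored as GGUFValue) is to warn and overwrite:

```python
# Sketch of a local workaround, not upstream behavior: log duplicate
# keys instead of raising, letting the last write win.
def add_key_value(self, key: str, val: Any, vtype: GGUFValueType) -> None:
    if key in self.kv_data:
        logger.warning('Duplicated key name %r, overwriting', key)
    self.kv_data[key] = GGUFValue(value=val, type=vtype)
```

The real fix is for the converter to stop emitting the key twice, but this keeps conversions running in the meantime.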
Yeah, someone broke the Gemma conversion recently; it needs to be fixed.
Any idea how to run Gemma in llama.cpp? I tried with the above models; the model answers in the llama.cpp UI (server), but after the answer it keeps generating by itself.
You need to specify the proper stop tokens, I would guess.
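For Gemma the end-of-turn marker is <end_of_turn>. A quick sketch against the server's /completion endpoint (the host/port are whatever you launched the server with):

```python
# Sketch: prompt the llama.cpp server with Gemma's chat format and pass
# <end_of_turn> as an explicit stop string so generation halts after the answer.
import requests

prompt = "<start_of_turn>user\nHello!<end_of_turn>\n<start_of_turn>model\n"
resp = requests.post("http://localhost:8080/completion", json={
    "prompt": prompt,
    "stop": ["<end_of_turn>"],
    "n_predict": 256,
})
print(resp.json()["content"])
```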