How to merge the file parts?
#1 by AliceThirty
I joined the Q_4_M files into one like this:
import os

# Concatenate the part files, in order, into a single output file.
with open(output_path, "wb") as outfile:
    for part_file in part_files:
        part_path = os.path.join(directory, part_file)
        with open(part_path, "rb") as infile:
            outfile.write(infile.read())
The resulting file is 73,219,623,072
bytes and its SHA256 is ae69ba5e00f9f53731941f1aa2e5d0101ca41581c59bf7d8b76714ae62c7d3b6
But when I load it with the latest version of Koboldcpp_cu12.exe (version 1.78), it crashes immediately and says:
llama_model_load: error loading model: invalid split file: C:\models\Behemoth-12T-'¿ÄT
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
File "koboldcpp.py", line 4720, in <module>
main(parser.parse_args(),start_server=True)
File "koboldcpp.py", line 4344, in main
loadok = load_model(modelname)
File "koboldcpp.py", line 900, in load_model
ret = handle.load_model(inputs)
OSError: exception: access violation reading 0x00000000000018A4
[18352] Failed to execute script 'koboldcpp' due to unhandled exception!
This exception doesn't happen with other Mistral 123B Q_4_M quantizations, such as Mistral itself, Magnum, Luminum, Lumimaid...
Yeah, don't do it that way: each split part carries its own GGUF header, so raw byte-concatenation produces an invalid file. You need to merge with llama.cpp's gguf-split tool instead, or ideally just don't merge them at all. If you just attempt to load the first part, koboldcpp should then automatically load the rest.
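For reference, a minimal sketch of the merge route, assuming llama.cpp's gguf-split tool is available on PATH (named llama-gguf-split in recent builds, gguf-split in older ones) and using placeholder file names; adjust both to your setup:

import subprocess

# Placeholder paths: point these at your actual first split part and desired output.
first_part = r"C:\models\my-model-Q4_K_M-00001-of-00002.gguf"
merged_out = r"C:\models\my-model-Q4_K_M.gguf"

# gguf-split --merge rewrites the per-part split metadata while joining the parts,
# which plain byte-concatenation does not do.
subprocess.run(["llama-gguf-split", "--merge", first_part, merged_out], check=True)

The simpler route is to skip merging entirely and pass the first part (the ...-00001-of-000NN.gguf file) to koboldcpp as the model; it then loads the remaining parts automatically.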
Thank you, it worked.