Only the q4 model works; the others suffer from a bug in convert.py that made them

#1
by ubuntulover - opened

@ubuntulover For me the f32 version works fine. I also made some quantizations of it, and they work fine too. I'm using the latest llama.cpp version.