Still doesn't work in LM Studio
This model still doesn't work in LM Studio, and this is AFTER the commit that was supposed to make it work.
Without even attempting to describe your problems, this is pretty useless.
Details are already described here:
https://huggingface.co/mradermacher/Ling-lite-GGUF/discussions/2
and here:
https://huggingface.co/mradermacher/Ling-lite-GGUF/discussions/1
After these two threads, which describe it all in detail, I thought everyone would know what I'm talking about, but if you insist, here is my own log just in case:
D:\AI_Models\mradermacher\Ling-lite-GGUF\Ling-lite.Q4_K_M.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = bailingmoe
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Ling Lite
llama_model_loader: - kv 3: general.size_label str = 64x1.5B
llama_model_loader: - kv 4: general.license str = mit
llama_model_loader: - kv 5: general.tags arr[str,1] = ["text-generation"]
llama_model_loader: - kv 6: bailingmoe.block_count u32 = 28
llama_model_loader: - kv 7: bailingmoe.context_length u32 = 16384
llama_model_loader: - kv 8: bailingmoe.embedding_length u32 = 2048
llama_model_loader: - kv 9: bailingmoe.feed_forward_length u32 = 5632
llama_model_loader: - kv 10: bailingmoe.attention.head_count u32 = 16
llama_model_loader: - kv 11: bailingmoe.attention.head_count_kv u32 = 4
llama_model_loader: - kv 12: bailingmoe.rope.freq_base f32 = 600000.000000
llama_model_loader: - kv 13: bailingmoe.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 14: bailingmoe.expert_used_count u32 = 6
llama_model_loader: - kv 15: bailingmoe.rope.dimension_count u32 = 128
llama_model_loader: - kv 16: bailingmoe.rope.scaling.type str = none
llama_model_loader: - kv 17: bailingmoe.leading_dense_block_count u32 = 0
llama_model_loader: - kv 18: bailingmoe.vocab_size u32 = 126464
llama_model_loader: - kv 19: bailingmoe.expert_feed_forward_length u32 = 1408
llama_model_loader: - kv 20: bailingmoe.expert_weights_scale f32 = 1.000000
llama_model_loader: - kv 21: bailingmoe.expert_count u32 = 64
llama_model_loader: - kv 22: bailingmoe.expert_shared_count u32 = 2
llama_model_loader: - kv 23: bailingmoe.expert_weights_norm bool = true
llama_model_loader: - kv 24: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 25: tokenizer.ggml.pre str = bailingmoe
[2025-04-03 04:04:08][DEBUG] llama_model_loader: - kv 26: tokenizer.ggml.tokens arr[str,126464] = ["!", "\"", "#", "$", "%", "&", "'", ...
[2025-04-03 04:04:08][DEBUG] llama_model_loader: - kv 27: tokenizer.ggml.token_type arr[i32,126464] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
[2025-04-03 04:04:08][DEBUG] llama_model_loader: - kv 28: tokenizer.ggml.merges arr[str,125824] = ["Ġ Ġ", "Ġ t", "i n", "Ġ a", "h e...
llama_model_loader: - kv 29: tokenizer.ggml.bos_token_id u32 = 126080
llama_model_loader: - kv 30: tokenizer.ggml.eos_token_id u32 = 126081
llama_model_loader: - kv 31: tokenizer.ggml.padding_token_id u32 = 126081
llama_model_loader: - kv 32: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 33: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 34: tokenizer.chat_template str = {% for message in messages %}{% set r...
llama_model_loader: - kv 35: general.quantization_version u32 = 2
llama_model_loader: - kv 36: general.file_type u32 = 15
llama_model_loader: - kv 37: general.url str = https://huggingface.co/mradermacher/L...
llama_model_loader: - kv 38: mradermacher.quantize_version str = 2
llama_model_loader: - kv 39: mradermacher.quantized_by str = mradermacher
llama_model_loader: - kv 40: mradermacher.quantized_at str = 2025-03-31T05:37:59+02:00
llama_model_loader: - kv 41: mradermacher.quantized_on str = kaos
llama_model_loader: - kv 42: general.source.url str = https://huggingface.co/inclusionAI/Li...
llama_model_loader: - kv 43: mradermacher.convert_type str = hf
llama_model_loader: - type f32: 85 tensors
llama_model_loader: - type q5_0: 14 tensors
llama_model_loader: - type q8_0: 14 tensors
llama_model_loader: - type q4_K: 225 tensors
llama_model_loader: - type q6_K: 29 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 10.40 GiB (5.32 BPW)
load_hparams: ----------------------- n_expert_used = 6
[2025-04-03 04:04:09][DEBUG] Failed to process regex: ''(?:[sSdDmMtT]|[lL][lL]|[vV][eE]|[rR][eE])|[^\r\n\p{L}\p{N}]?+\p{L}+|\p{N}| ?[^\s\p{L}\p{N}]++[\r\n]*|\s*[\r\n]|\s+(?!\S)|\s+'
Regex error: regex_error(error_badrepeat): One of *?+{ was not preceded by a valid regular expression.
[2025-04-03 04:04:09][DEBUG] llama_model_load: error loading model: error loading model vocabulary: Failed to process regex
llama_model_load_from_file_impl: failed to load model
[2025-04-03 04:04:09][DEBUG] common_init_from_params: failed to load model 'D:\AI_Models\mradermacher\Ling-lite-GGUF\Ling-lite.Q4_K_M.gguf'
[2025-04-03 04:04:09][DEBUG] lmstudio-llama-cpp: failed to load model. Error: error loading model: error loading model vocabulary: Failed to process regex
Again, this happened using the llama.cpp backend, which supposedly already has the fix for this type of model. As the Runtime's release notes in my LM Studio show, the commit that supposedly fixes Ling models is already included in the Runtime.
[2025-04-03 04:04:09][DEBUG] Failed to process regex: ''(?:[sSdDmMtT]|[lL][lL]|[vV][eE]|[rR][eE])|[^\r\n\p{L}\p{N}]?+\p{L}+|\p{N}| ?[^\s\p{L}\p{N}]++[\r\n]|\s[\r\n]|\s+(?!\S)|\s+'
Regex error: regex_error(error_badrepeat): One of *?+{ was not preceded by a valid regular expression.
[2025-04-03 04:04:09][DEBUG] llama_model_load: error loading model: error loading model vocabulary: Failed to process regex
This is clearly not using the fixed version as that is the old regex!
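For context, what breaks here is the possessive quantifiers (`?+`, `++`) in the old pretokenizer pattern, which the ECMAScript grammar of C++ `std::regex` rejects with `error_badrepeat`. Purely as an illustration (this is not what the actual fix does — llama.cpp handles such patterns in its own tokenizer code), a hypothetical sketch of downgrading those two forms to plain greedy quantifiers looks like this:

```python
def downgrade_possessive(pattern: str) -> str:
    """Rewrite the possessive quantifiers '?+' and '++' as plain greedy
    '?' and '+'. Only handles the two forms present in this pretokenizer
    pattern; a general rewriter would need a real regex parser. The result
    is NOT equivalent: possessive quantifiers never backtrack, so the
    downgraded pattern can match differently on some inputs."""
    return pattern.replace("?+", "?").replace("++", "+")

# Fragment of the failing pattern from the log above:
old = r"[^\r\n\p{L}\p{N}]?+\p{L}+|\p{N}| ?[^\s\p{L}\p{N}]++[\r\n]*"
print(downgrade_possessive(old))
# [^\r\n\p{L}\p{N}]?\p{L}+|\p{N}| ?[^\s\p{L}\p{N}]+[\r\n]*
```

Note the downgraded string still uses `\p{…}` Unicode classes, so it is only meant to show the syntactic difference, not to be compiled as-is.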
Again, this happened using the llama.cpp backend, which supposedly already has the fix for this type of model. As the Runtime's release notes in my LM Studio show, the commit that supposedly fixes Ling models is already included in the Runtime.
The b5002 release is too old, you need b5026 or newer.
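If you want to check whether a given runtime is new enough, llama.cpp release tags are just "b" followed by a monotonically increasing build number, so they can be compared numerically. A hypothetical helper (tag values taken from this thread):

```python
def build_at_least(tag: str, minimum: str) -> bool:
    """Compare llama.cpp release tags like 'b5002' by their build number."""
    return int(tag.lstrip("b")) >= int(minimum.lstrip("b"))

print(build_at_least("b5002", "b5026"))  # False: bundled runtime is too old
print(build_at_least("b5026", "b5026"))  # True
```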
Ah, I see. Thanks. However, I don't think I can simply update the files manually, can I? LM Studio runtimes may contain some additional files not supplied by llama.cpp, if I'm not mistaken.
After these two threads, which describe it all in detail, I thought everyone would know what I'm talking about
You didn't link to these reports, so how would anybody know you were talking about those? Especially since you gave completely false information, claiming it happened after the fix, when your software version was simply outdated.
And yes, it's probably not easy to upgrade LM Studio's llama.cpp version yourself; you'd probably have to wait for a fix from the LM Studio developers.
This is the beta version of LM Studio, which contains the latest version of LM Studio itself but unfortunately not always the latest version of the llama.cpp runtime. As you can see from the screenshot, the release notes there did mention support for Ling models, so I was under the impression that this runtime already contained the fixes required for this model to load properly. It was an honest mistake, but I apologize to anyone who honestly feels their time here was not well spent, because I did not intend to mislead anyone. If you have never made a similar mistake at any point in your life, good for you.