A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Paper • 2504.07086 • Published
Nice! I'm working on a CPU-only inference layer and have been looking for a good, simple alternative to build on.
I would love to see q8 and q4 quantization support! A sketch of what I mean is below.
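For reference, here is a minimal sketch of what per-block q8 quantization typically looks like (similar in spirit to ggml's Q8_0 format). The function names and block size are illustrative assumptions, not this project's API:

```python
# A minimal sketch of symmetric per-block q8 quantization; block size and
# names are illustrative assumptions, not part of any existing API.
import numpy as np

def quantize_q8(weights: np.ndarray, block_size: int = 32):
    """Quantize float32 weights to int8 with one scale per block."""
    w = weights.reshape(-1, block_size)
    # One scale per block: the largest magnitude maps to the int8 limit 127.
    scales = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(w / scales), -127, 127).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize_q8(q: np.ndarray, scales: np.ndarray, shape):
    """Recover an approximate float32 tensor from int8 values and scales."""
    return (q.astype(np.float32) * scales).reshape(shape)

w = np.random.randn(4, 64).astype(np.float32)
q, s = quantize_q8(w)
w_hat = dequantize_q8(q, s, w.shape)
print("max abs error:", np.abs(w - w_hat).max())
```

A q4 variant would work the same way, just packing two 4-bit values per byte and clipping to ±7, at the cost of higher reconstruction error.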
What is the performance like compared to ggml and llama.cpp?
Also, does it support quantization?
Thanks
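If it helps frame the comparison, here is a rough sketch of how one might measure decode throughput per backend; `generate` is a hypothetical stand-in for whichever inference call each library actually exposes:

```python
# A minimal timing sketch for comparing decode throughput across backends.
# `generate` is a hypothetical placeholder, not a real API of this project,
# ggml, or llama.cpp.
import time

def tokens_per_second(generate, prompt: str, n_tokens: int = 128) -> float:
    """Time a single generation call and report decode throughput."""
    start = time.perf_counter()
    generate(prompt, max_tokens=n_tokens)
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed

# Dummy backend standing in for a real one, to show the harness runs:
def dummy_generate(prompt, max_tokens=128):
    time.sleep(0.01 * max_tokens)  # pretend each token takes 10 ms

print(f"{tokens_per_second(dummy_generate, 'hello'):.1f} tok/s")
```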
Is it possible to have the encoder and decoder separated for this model? I would like to use my RK3588 NPU for the decoder. Thanks!
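One way this could work, sketched under assumptions: export the encoder and decoder as separate ONNX graphs, then feed the decoder graph to the NPU toolchain (the RK3588 is typically targeted by converting ONNX). The toy modules and shapes below are illustrative, not this model's actual architecture:

```python
# A minimal sketch of exporting encoder and decoder as separate ONNX graphs.
# ToyEncoder/ToyDecoder are hypothetical stand-ins for the real model halves.
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.proj(x))

class ToyDecoder(nn.Module):
    def __init__(self, dim: int = 64, vocab: int = 100):
        super().__init__()
        self.out = nn.Linear(dim, vocab)

    def forward(self, h):
        return self.out(h)

encoder, decoder = ToyEncoder(), ToyDecoder()
x = torch.randn(1, 16, 64)

# Export each half to its own file; the decoder graph can then be handed
# to a separate runtime or NPU converter while the encoder stays on CPU.
torch.onnx.export(encoder, (x,), "encoder.onnx")
torch.onnx.export(decoder, (encoder(x),), "decoder.onnx")
```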