In my tests, llama.cpp is 26.8% faster than Ollama. I upgraded both to their latest versions and ran the same DeepSeek R1 Distill 1.5B model with the same settings on the same hardware, so it's an apples-to-apples comparison.
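For anyone wanting to reproduce this kind of comparison: measure tokens/second in each runtime (e.g. llama.cpp ships a `llama-bench` tool, and `ollama run --verbose` reports an eval rate), then compare the two throughputs. A minimal sketch of the arithmetic, with hypothetical throughput numbers of my own choosing (only the 26.8% figure comes from the actual runs):

```python
# Hypothetical tokens/second numbers, for illustration only.
llama_cpp_tps = 63.4   # assumed llama.cpp throughput
ollama_tps = 50.0      # assumed Ollama throughput

# Relative speedup of llama.cpp over Ollama, in percent.
speedup_pct = (llama_cpp_tps / ollama_tps - 1) * 100
print(f"llama.cpp is {speedup_pct:.1f}% faster")
```

Note that a fair comparison also needs matched settings: same quantization, same context size, same number of GPU-offloaded layers, and same sampling parameters.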
On-device AI reasoning (ODA-R) using speculative decoding: DeepSeek-R1-Distill-Qwen-1.5B as the draft model and DeepSeek-R1-Distill-Qwen-32B as the target. DSPy compiles the reasoning prompts for math, engineering, code...
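The mechanics of speculative decoding can be sketched with toy stand-in "models" (the function names and integer models below are mine, not from llama.cpp or any library): the cheap draft proposes a short run of tokens, and the target verifies them, keeping the longest agreeing prefix plus one corrected token.

```python
def greedy(model, prompt, n):
    """Plain greedy decoding with the target model alone (baseline)."""
    out = list(prompt)
    for _ in range(n):
        out.append(model(out))
    return out[len(prompt):]

def speculative_decode(target_next, draft_next, prompt, n_draft=4, max_new=12):
    """Toy greedy speculative decoding.

    In a real implementation the target verifies the whole draft in a
    single batched forward pass; that batching is where the speedup
    comes from. This sketch only shows the accept/reject logic.
    """
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        # Draft model cheaply proposes a short run of tokens.
        ctx = list(out)
        proposed = []
        for _ in range(n_draft):
            t = draft_next(ctx)
            proposed.append(t)
            ctx.append(t)
        # Target checks each proposed token in order.
        for t in proposed:
            expected = target_next(out)
            if t == expected:
                out.append(t)          # accepted: a "free" token
            else:
                out.append(expected)   # rejected: keep target's token
                break
    return out[len(prompt):len(prompt) + max_new]
```

With greedy verification the output is guaranteed to be identical to what the target model would produce alone; the draft only changes how fast you get there.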
Training a model to reason in the continuous latent space, based on Meta's Coconut. If it all works, I will apply it to the MiniCPM-o SVD-LR. The endgame is a multimodal, adaptive, and efficient foundational on-device AI model.
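Coconut's core move: during latent "thought" steps, the model's last hidden state is fed straight back as the next input embedding instead of being decoded to a token and re-embedded. A toy pure-Python sketch of that loop, with a made-up one-layer stand-in for the model (everything here is illustrative, not Meta's code):

```python
import math
import random

random.seed(0)
d = 4  # toy hidden dimension
W = [[random.gauss(0, 0.4) for _ in range(d)] for _ in range(d)]
vocab = [[random.gauss(0, 1) for _ in range(d)] for _ in range(5)]  # 5 token embeddings

def matvec(M, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def layer(h, x):
    # Stand-in for a transformer pass: mix state and input, squash.
    return [math.tanh(a) for a in matvec(W, [hi + xi for hi, xi in zip(h, x)])]

h = [0.0] * d
# Standard decoding would project h to logits, pick a token, and feed
# that token's embedding back in. Coconut-style latent reasoning skips
# the token round-trip: the hidden state itself is the next input.
for _ in range(3):        # three continuous "thoughts"
    h = layer(h, h)
# Only at the end do we project back to the vocabulary.
logits = [sum(e_i * h_i for e_i, h_i in zip(e, h)) for e in vocab]
next_token = max(range(len(logits)), key=logits.__getitem__)
```

The appeal for on-device use is that latent steps avoid committing to a discrete token at every reasoning step, so the chain of thought stays compact and differentiable.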