philipp-zettl

AI & ML interests

NLP/CV/Multimodal learning

philipp-zettl's activity

reacted to hexgrad's post with 🔥 5 days ago

Post

8115

hexgrad/Kokoro-82M got an upgrade! ⬆️ More voices, more languages, pip install kokoro, and still 82M parameters.

GitHub: https://github.com/hexgrad/kokoro
PyPI: https://pypi.org/project/kokoro/
Space: hexgrad/Kokoro-TTS

11 replies

upvoted an article 9 days ago

Article

FineWeb2-C: Help Build Better Language Models in Your Language

Dec 23, 2024 • 18

reacted to mitkox's post with 🚀 9 days ago

Post

2142

llama.cpp is 26.8% faster than ollama.
I have upgraded both, and using the same settings, I am running the same DeepSeek R1 Distill 1.5B on the same hardware. It's an Apples to Apples comparison.

Total duration:
llama.cpp 6.85 sec <- 26.8% faster
ollama 8.69 sec

Breakdown by phase:
Model loading
llama.cpp 241 ms <- 2x faster
ollama 553 ms

Prompt processing
llama.cpp 416.04 tokens/s with an eval time of 45.67 ms <- 10x faster
ollama 42.17 tokens/s with an eval time of 498 ms

Token generation
llama.cpp 137.79 tokens/s with an eval time of 6.62 sec <- 13% faster
ollama 122.07 tokens/s with an eval time 7.64 sec

llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
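The speedup labels in the post above can be re-derived from the quoted figures. The following is a minimal sketch that only redoes the arithmetic on the numbers as posted; it measures nothing, and the helper names (`percent_faster`, `times_faster`) are hypothetical, introduced for illustration.

```python
# Recompute the speedup labels from the figures quoted in the post.
# All numbers are copied from the post; nothing here is re-measured.

def percent_faster(slow: float, fast: float) -> float:
    """How much faster `fast` is than `slow`, in percent (time-based)."""
    return (slow / fast - 1.0) * 100.0

def times_faster(larger: float, smaller: float) -> float:
    """Plain ratio of two positive quantities."""
    return larger / smaller

# Total duration in seconds (lower is better).
total_pct = percent_faster(slow=8.69, fast=6.85)

# Model loading in milliseconds (lower is better).
load_ratio = times_faster(553, 241)

# Prompt processing throughput in tokens/s (higher is better).
prompt_ratio = times_faster(416.04, 42.17)

# Token generation throughput in tokens/s (higher is better).
gen_pct = (times_faster(137.79, 122.07) - 1.0) * 100.0

print(f"total duration:    {total_pct:.1f}% faster")
print(f"model loading:     {load_ratio:.2f}x faster")
print(f"prompt processing: {prompt_ratio:.2f}x faster")
print(f"token generation:  {gen_pct:.1f}% faster")
```

Under these definitions the totals work out to roughly 26.9%, 2.3x, 9.9x, and 12.9%, which agrees with the rounded labels in the post (26.8%, 2x, 10x, 13%).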

7 replies

reacted to fdaudens's post with ❤️ 9 days ago

Post

7996

Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:

- Original release: 8 models, 540K downloads. Just the beginning...

- The community turned those open-weight models into +550 NEW models on Hugging Face. Total downloads? 2.5M—nearly 5X the originals.

The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.

When you empower builders, innovation explodes. For everyone. 🚀

The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version, with 1M downloads alone.

4 replies

liked 3 models 9 days ago

updated a model 11 days ago

philipp-zettl/vit-base-patch16-224-in21k-oxford-flowers

Updated 11 days ago • 5

published a model 11 days ago

philipp-zettl/vit-base-patch16-224-in21k-oxford-flowers

Updated 11 days ago • 5

liked 3 models 12 days ago

bytedance-research/UI-TARS-7B-DPO

Image-Text-to-Text • Updated 11 days ago • 19.7k • 116

hexgrad/Kokoro-82M

Text-to-Speech • Updated 4 days ago • 148k • 2.79k

nateraw/food

Image Classification • Updated May 17, 2022 • 1.85k • 55

liked a dataset 15 days ago

HuggingFaceM4/Docmatix

Viewer • Updated Aug 26, 2024 • 2.55M • 13.4k • 244

upvoted a paper 23 days ago

The GAN is dead; long live the GAN! A Modern GAN Baseline

Paper • 2501.05441 • Published 27 days ago • 87

liked 2 models about 1 month ago

nashikone/iroiroLoRA

Updated 10 days ago • 108

deepseek-ai/DeepSeek-V3

Text Generation • Updated 12 days ago • 984k • 3.15k

upvoted a paper about 1 month ago

1.58-bit FLUX

Paper • 2412.18653 • Published Dec 24, 2024 • 74

upvoted a paper about 2 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 125

liked a model about 2 months ago

answerdotai/ModernBERT-large

Fill-Mask • Updated 21 days ago • 2.05M • 342

reacted to lewtun's post with 🚀 about 2 months ago

Post

6824

We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute 🔥

How? By combining step-wise reward models with tree search algorithms :)

We show that smol models can match or exceed the performance of their much larger siblings when given enough "time to think"

We're open sourcing the full recipe and sharing a detailed blog post.

In our blog post we cover:

📈 Compute-optimal scaling: How we implemented DeepMind's recipe to boost the mathematical capabilities of open models at test-time.

🎄 Diverse Verifier Tree Search (DVTS): An unpublished extension we developed to the verifier-guided tree search technique. This simple yet effective method improves diversity and delivers better performance, particularly at large test-time compute budgets.

🧭 Search and Learn: A lightweight toolkit for implementing search strategies with LLMs and built for speed with vLLM

Here are the links:

- Blog post: HuggingFaceH4/blogpost-scaling-test-time-compute

- Code: https://github.com/huggingface/search-and-learn

Enjoy!
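To make the idea of pairing a step-wise reward model with search more concrete, here is a toy sketch of reward-guided beam search over partial solutions. It is not the recipe from the blog post, not DVTS, and not code from the search-and-learn repository. The generator (`propose_steps`) and the scorer (`score_partial`) are hypothetical stand-ins for an LLM and a process reward model, and the task (pick numbers that sum to a target) is an assumption chosen only so the example runs on its own.

```python
from typing import List, Tuple

Path = Tuple[int, ...]  # a partial solution: the steps taken so far

def propose_steps(path: Path) -> List[int]:
    """Hypothetical stand-in for an LLM proposing candidate next steps."""
    return [1, 2, 3]

def score_partial(path: Path, target: int) -> float:
    """Hypothetical stand-in for a step-wise reward model.

    Partial sums closer to the target score higher; overshooting is
    penalized heavily because it cannot be undone in this toy task.
    """
    total = sum(path)
    penalty = 10.0 if total > target else 0.0
    return -abs(target - total) - penalty

def beam_search(target: int, beam_width: int = 2, max_steps: int = 5) -> Path:
    """Keep the `beam_width` best-scoring partial solutions at every step."""
    beams: List[Path] = [()]
    for _ in range(max_steps):
        # Expand every surviving partial solution by every proposed step.
        candidates = [p + (s,) for p in beams for s in propose_steps(p)]
        # Rank the expansions with the step-wise scorer and prune.
        candidates.sort(key=lambda p: score_partial(p, target), reverse=True)
        beams = candidates[:beam_width]
        if sum(beams[0]) == target:
            break  # the best beam already solves the toy task
    return beams[0]

solution = beam_search(target=7)
print(solution, sum(solution))
```

Spending more test-time compute corresponds here to a larger `beam_width` or more `max_steps`: more partial solutions are generated and scored before one is selected.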

2 replies