philipp-zettl (PZ)

reacted to marksverdhei's post with 🤗 20 days ago

Post

2645

Dear Hugging Face team, can we please have a way to archive hf repositories / spaces? I have a bunch of spaces that used to work but don't any more due to the hf space implementations changing and i think it would be good if I could archive those like in GitHub.

React to this post if you want to see this feature! 💡

reacted to codelion's post with 🔥 about 2 months ago

Post

6122

Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:

→ Depth beats width: 32 layers outperforms 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: codelion/dhara-70m

1 reply

·

reacted to tomaarsen's post with ❤️ 4 months ago

Post

4484

🤗 Sentence Transformers is joining Hugging Face! 🤗 This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face! Details:

Today, the Ubiquitous Knowledge Processing (UKP) Lab is transferring the project to Hugging Face. Sentence Transformers will remain a community-driven, open-source project, with the same open-source license (Apache 2.0) as before. Contributions from researchers, developers, and enthusiasts are welcome and encouraged. The project will continue to prioritize transparency, collaboration, and broad accessibility.

Read our full announcement for more details and quotes from UKP and Hugging Face leadership: https://huggingface.co/blog/sentence-transformers-joins-hf

We see an increasing wish from companies to move from large LLM APIs to local models for better control and privacy, reflected in the library's growth: in just the last 30 days, Sentence Transformer models have been downloaded >270 million times, second only to transformers.

I would like to thank the UKP Lab, and especially Nils Reimers and Iryna Gurevych, both for their dedication to the project and for their trust in myself, both now and two years ago. Back then, neither of you knew me well, yet you trusted me to take the project to new heights. That choice ended up being very valuable for the embedding & Information Retrieval community, and I think this choice of granting Hugging Face stewardship will be similarly successful.

I'm very excited about the future of the project, and for the world of embeddings and retrieval at large!

1 reply

·

reacted to sergiopaniego's post with 🔥 5 months ago

Post

3009

A few days ago, Thinking Machines Lab released “LoRA Without Regret”, showing that LoRA can match full fine-tuning performance when configured right.

Naturally, we decided to reproduce the results with TRL and release a guide!

https://huggingface.co/docs/trl/main/en/lora_without_regret

reacted to yjernite's post with 🤗 7 months ago

Post

4261

𝗙𝗶𝗿𝘀𝘁 𝗚𝗣𝗔𝗜 𝗠𝗼𝗱𝗲𝗹 𝘄𝗶𝘁𝗵 𝗘𝗨 𝗗𝗮𝘁𝗮 𝗧𝗿𝗮𝗻𝘀𝗽𝗮𝗿𝗲𝗻𝗰𝘆 𝗧𝗲𝗺𝗽𝗹𝗮𝘁𝗲? 🇪🇺

With the release of the EU data transparency template this week, we finally got to see one of the most meaningful artifacts to come out of the AI Act implementation so far (haven't you heard? AI's all about the data! 📊📚)

The impact of the template will depend on how effectively it establishes a minimum meaningful transparency standard for companies that don't otherwise offer any transparency into their handling of e.g. personal data or (anti?-)competitive practices in commercial licensing - we'll see how those play out as new models are released after August 2nd 👀

In the meantime, I wanted to see how the template works for a fully open-source + commercially viable model, so I filled it out for the SmolLM3 - which my colleagues at Hugging Face earlier this month 🤗 ICYMI, it's fully open-source with 3B parameters and performance matching the best similar-size models (I've switched all my local apps from Qwen3 to it, you should too 💡)

Verdict: congrats to the European Commission AI Office for making it so straightforward! Fully open and transparent models remain a cornerstone of informed regulation and governance, but the different organizational needs of their developers aren't always properly accounted for in new regulation. In this case, it took me all of two hours to fill out and publish the template (including reading the guidelines) - so kudos for making it feasible for smaller and distributed organizations 🙌 Definitely a step forward for transparency 🔍

To learn more have a look at:

- The SmolLM3 model: HuggingFaceTB/SmolLM3-3B
- Its filled out Public Summary of Training Content: hfmlsoc/smollm3-eu-data-transparency
- And if you're interested, some previous remarks on regulatory minimum meaningful standards for data disclosure: https://huggingface.co/blog/yjernite/naiac-data-transparency

reacted to schuler's post with 🔥 about 1 year ago

Post

7270

📢 New Research Alert: Making Language Models Smaller & Smarter!

Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance.

The secret? Grouped pointwise convolutions. Yes. We brought a method from computer vision to the transformers arena.

🔑 Key Findings:
• 77% parameter reduction.
• Maintained model capabilities.
• Improved generalization.

Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm

2 replies

·

reacted to hexgrad's post with 🔥 about 1 year ago

Post

8858

hexgrad/Kokoro-82M got an upgrade! ⬆️ More voices, more languages, pip install kokoro, and still 82M parameters.

GitHub: https://github.com/hexgrad/kokoro
PyPI: https://pypi.org/project/kokoro/
Space: hexgrad/Kokoro-TTS

11 replies

·

reacted to mitkox's post with 🚀 about 1 year ago

Post

3837

llama.cpp is 26.8% faster than ollama.
I have upgraded both, and using the same settings, I am running the same DeepSeek R1 Distill 1.5B on the same hardware. It's an Apples to Apples comparison.

Total duration:
llama.cpp 6.85 sec <- 26.8% faster
ollama 8.69 sec

Breakdown by phase:
Model loading
llama.cpp 241 ms <- 2x faster
ollama 553 ms

Prompt processing
llama.cpp 416.04 tokens/s with an eval time 45.67 ms <- 10x faster
ollama 42.17 tokens/s with an eval time of 498 ms

Token generation
llama.cpp 137.79 tokens/s with an eval time 6.62 sec <- 13% faster
ollama 122.07 tokens/s with an eval time 7.64 sec

llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.

7 replies

·

reacted to fdaudens's post with ❤️ about 1 year ago

Post

9837

Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:

- Original release: 8 models, 540K downloads. Just the beginning...

- The community turned those open-weight models into +550 NEW models on Hugging Face. Total downloads? 2.5M—nearly 5X the originals.

The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.

When you empower builders, innovation explodes. For everyone. 🚀

The most popular community model? @bartowski 's DeepSeek-R1-Distill-Qwen-32B-GGUF version — 1M downloads alone.

5 replies

·

reacted to lewtun's post with 🚀 about 1 year ago

Post

7087

We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute 🔥

How? By combining step-wise reward models with tree search algorithms :)

We show that smol models can match or exceed the performance of their much larger siblings when given enough "time to think"

We're open sourcing the full recipe and sharing a detailed blog post.

In our blog post we cover:

📈 Compute-optimal scaling: How we implemented DeepMind's recipe to boost the mathematical capabilities of open models at test-time.

🎄 Diverse Verifier Tree Search (DVTS): An unpublished extension we developed to the verifier-guided tree search technique. This simple yet effective method improves diversity and delivers better performance, particularly at large test-time compute budgets.

🧭 Search and Learn: A lightweight toolkit for implementing search strategies with LLMs and built for speed with vLLM

Here's the links:

- Blog post: HuggingFaceH4/blogpost-scaling-test-time-compute

- Code: https://github.com/huggingface/search-and-learn

Enjoy!

2 replies

·

reacted to lhoestq's post with ❤️ about 1 year ago

Post

3119

Made a HF Dataset editor a la gg sheets here: lhoestq/dataset-spreadsheets

With Dataset Spreadsheets:
✏️ Edit datasets in the UI
🔗 Share link with collaborators
🐍 Use locally in DuckDB or Python

Available for the 100,000+ parquet datasets on HF :)

reacted to merve's post with ❤️ about 1 year ago

Post

5745

This week in open-source AI was insane 🤠 A small recap🕺🏻 merve/dec-6-releases-67545caebe9fc4776faac0a3

Multimodal 🖼️
> Google shipped a PaliGemma 2, new iteration of PaliGemma with more sizes: 3B, 10B and 28B, with pre-trained and captioning variants 👏
> OpenGVLab released InternVL2, seven new vision LMs in different sizes, with sota checkpoint with MIT license ✨
> Qwen team at Alibaba released the base models of Qwen2VL models with 2B, 7B and 72B ckpts

LLMs 💬
> Meta released a new iteration of Llama 70B, Llama3.2-70B trained further
> EuroLLM-9B-Instruct is a new multilingual LLM for European languages with Apache 2.0 license 🔥
> Dataset: CohereForAI released GlobalMMLU, multilingual version of MMLU with 42 languages with Apache 2.0 license
> Dataset: QwQ-LongCoT-130K is a new dataset to train reasoning models
> Dataset: FineWeb2 just landed with multilinguality update! 🔥 nearly 8TB pretraining data in many languages!

Image/Video Generation 🖼️
> Tencent released HunyuanVideo, a new photorealistic video generation model
> OminiControl is a new editing/control framework for image generation models like Flux

Audio 🔊
> Indic-Parler-TTS is a new text2speech model made by community

reacted to christopher's post with 🔥 about 1 year ago

Post

2452

The Lichess database of games, puzzles, and engine evaluations is now on the Hub:

Lichess

Billions of chess data points to download, query, and stream and we're excited to see what you'll build with it! ♟️ 🤗

- https://huggingface.co/collections/Lichess/positions-datasets-66f50837db5cd3287d60d489
- https://huggingface.co/collections/Lichess/games-datasets-66f508df78f4b43e1bb2d353

posted an update about 1 year ago

Post

1170

alias rm='rm -i'

Better be safe than sorry.

reacted to andito's post with 🔥 about 1 year ago

Post

2007

SmolVLM speeding locally on a laptop thanks to mlx-vlm and
@Gradio ! Try it with two lines:
pip install git+https://github.com/andimarafioti/mlx-vlm.git@stream-generate-fix
python -m mlx_vlm.chat_ui --model mlx-community/SmolVLM-Instruct-8bit

Gotta love the MLX community! Big thanks to @pcuenq and @prince_canuma !

reacted to merve's post with ❤️ about 1 year ago

Post

3387

your hugging face profile now has your recent activities 🤗

replied to their post over 1 year ago

Yikes... good point!

replied to their post over 1 year ago

I think you got me wrong there. I'm mostly concerned about image generation LoRAs that are trained on your person or for instance the pictures of children.
Gate keeping the secret sauce for base models is different and I totally agree with you on that part.

replied to their post over 1 year ago

I'm more concerned about bad actors using them to create content that might harm you or put you in a bad spot by creating visual content with your face.
For instance to blackmail you or harm your reputation.

I am for sure a big supporter of open source and publish all the things I have the rights to. Yet, I wouldn't publish a LoRA that is trained on my face.

posted an update over 1 year ago

Post

1575

This is probably a very hot take, but here goes nothing.

With the incredibly accurate LoRAs we see emerge for high quality models like FLUX from services like fal.ai that offer training within single digit minutes, e.g. 2 min per 1000 iterations.

Why the hell are people publishing private LoRAs as public models?!
Take a look at this listing: https://huggingface.co/models?other=base_model:adapter:black-forest-labs%2FFLUX.1-dev&sort=created

I would expect that people that hold a HF account have some kind of forward thinking. Heck, do you really want to give anyone the power to create ultra realistic images of yourself?!

Didn't we learn anything from social media?
I am puzzled..

6 replies

·

PZ

AI & ML interests

Recent Activity

Organizations

PZ

AI & ML interests

Recent Activity

Organizations

philipp-zettl's activity