Jared Sulzdorf PRO

jsulz

https://www.jsulz.com/

AI & ML interests

Infrastructure, law, policy

jsulz's activity

liked a Space 3 days ago

Repo Graph

👁

A byte-level map of the Hugging Face Hub

posted an update 3 days ago

As xet-team infrastructure begins backing hundreds of repositories on the Hugging Face Hub, we’re getting to put on our researcher hats and peer into the bytes. 👀 🤓

IMO, one of the most interesting ideas Xet storage introduces is a globally shared store of data.

When you upload a file through Xet, the contents are split into ~64KB chunks and deduplicated, but what if those same chunks already exist in another repo on the Hub?

If we can detect and reuse them, we can skip uploading them as well, saving time and bandwidth for AI builders. More on how that works here:
🔗 https://huggingface.co/blog/from-chunks-to-blocks#scaling-deduplication-with-aggregation
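
To make the chunk-and-skip idea concrete, here is a minimal Python sketch. It is not Xet's implementation: the gear-hash chunker, the `GEAR` table, the SHA-256 keys, and the `upload` helper with its in-memory `global_store` dict are all assumptions for illustration. Only the ~64KB target comes from the post.

```python
import hashlib
import random

# Hypothetical gear table: one fixed pseudo-random 64-bit value per byte value.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def chunk(data, avg_bits=16, min_size=8 * 1024, max_size=128 * 1024):
    """Split data at content-defined boundaries.

    A cut is made when the low `avg_bits` bits of a rolling gear hash are
    zero (about one position in 2**avg_bits, i.e. 64 KiB by default) once
    the chunk has reached `min_size`, and is forced at `max_size`.
    """
    mask = (1 << avg_bits) - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def upload(data, global_store, **chunk_args):
    """Add only unseen chunks to global_store (assumed shared by all repos).

    Returns (recipe, new_bytes): the ordered chunk keys needed to rebuild
    the file, and how many bytes actually had to be stored.
    """
    recipe, new_bytes = [], 0
    for piece in chunk(data, **chunk_args):
        key = hashlib.sha256(piece).hexdigest()
        if key not in global_store:
            global_store[key] = piece
            new_bytes += len(piece)
        recipe.append(key)
    return recipe, new_bytes


if __name__ == "__main__":
    # Small chunk sizes so the demo runs quickly on a tiny input.
    small = dict(avg_bits=8, min_size=16, max_size=4096)
    store = {}
    gen = random.Random(1)
    base = bytes(gen.getrandbits(8) for _ in range(20_000))
    _, first = upload(base, store, **small)                   # "repo A"
    _, second = upload(b"new header" + base, store, **small)  # "repo B"
    print("repo A stored", first, "bytes; repo B stored", second, "bytes")
```

Because boundaries depend on the content rather than on fixed offsets, a file that differs only by a small edit should mostly produce chunks the store has already seen, which is what lets a second repo skip them.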

Because of this, different repositories can share bytes we store. That opens up something cool - we can draw a graph of which repos actually share data at the chunk level, where:

- Nodes = repositories
- Edges = shared chunks
- Edge thickness = how much they overlap
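
A rough sketch of how such a graph could be assembled, assuming each repo's chunk hashes are already available as a set. The `build_repo_graph` helper, the repo names, and the hashes are hypothetical, and the weight here is a shared-chunk count, which may not be how the Space measures overlap.

```python
from itertools import combinations


def build_repo_graph(repo_chunks):
    """Return (nodes, edges) for a repo-overlap graph.

    repo_chunks maps a repo name to the set of chunk hashes it references.
    edges maps an (a, b) repo pair to the number of chunks both reference,
    which can drive edge thickness when drawing.
    """
    nodes = sorted(repo_chunks)
    edges = {}
    for a, b in combinations(nodes, 2):
        shared = len(repo_chunks[a] & repo_chunks[b])
        if shared:  # no edge between repos that share nothing
            edges[(a, b)] = shared
    return nodes, edges


# Hypothetical repos and chunk hashes, for illustration only.
example = {
    "org/bert-base": {"c1", "c2", "c3"},
    "org/bert-finetune": {"c1", "c2", "c9"},
    "org/unrelated": {"c7"},
}
nodes, edges = build_repo_graph(example)
print(edges)  # {('org/bert-base', 'org/bert-finetune'): 2}
```

Comparing every pair is fine for a handful of repos; at Hub scale an inverted index from chunk hash to repos would likely be the better starting point.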

xet-team/repo-graph

Come find the many BERT islands. Or see how datasets relate in practice, not just in theory. See how libraries or tasks can tie repositories together. You can play around with node size using storage/likes/downloads too.

The result is a super fun visualization from @saba9 and @znation that I’ve already lost way too much time to. I'm excited to see how the networks grow as we add more repositories!

liked a model 3 days ago

reach-vb/yolo

Updated 8 days ago • 121 • 1

updated a Space 3 days ago

Repo Graph

👁

A byte-level map of the Hugging Face Hub

published a Space 3 days ago

Repo Graph

👁

A byte-level map of the Hugging Face Hub

replied to their post 4 days ago

What else would folks find interesting to explore?

Certain model trees? Overlap between a set of datasets?

Anything else?

posted an update 4 days ago

What does it mean when models share the same bytes?

We've investigated some quants and have seen that a considerable portion of the quantizations of the same model share bytes, which can be deduplicated to save quantizers on the Hub significant upload time.

This Space, where we crack open a repo from @bartowski, shows we can get significant dedupe: xet-team/quantization-dedup

You can get a sense of why by reading this write-up: https://github.com/bartowski1182/llm-knowledge/blob/main/quantization/quantization.md

But what about finetuned models?

Since going into production, the xet-team has migrated hundreds of repositories on the Hub to our storage layer, including classic "pre-Hub" open-source models like FacebookAI/xlm-roberta-large (XLM-R) from FacebookAI.

XLM-R, introduced in 2019, set new benchmarks for multilingual NLP by learning shared representations across 100 languages. It was then fine-tuned on English, Spanish, Dutch, and German, generating language-specific derivatives for each. Check out the paper here: Unsupervised Cross-lingual Representation Learning at Scale (1911.02116)

These finetunes share much of the same architecture and layout as XLM-R with similar training methods and goals. It makes sense that they would share bytes, but it's still fascinating to see.

We put together a similar Space to explore these models and see where they overlap. Check it out for yourself: xet-team/finetune-dedupe

The darker each block in the heatmap, the more its bytes are shared. Clicking on a repo's blocks shows all other repos that share those blocks.
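
One way to picture what drives the shading, as a sketch only: count, for each block of a repo, how many other repos contain the same block. The `block_share_counts` and `repos_sharing` helpers and the sample data are hypothetical, and the Space may compute its shading differently (for example per byte rather than per block).

```python
def block_share_counts(repo_blocks, repo):
    """For each block of `repo`, count the other repos that contain it.

    repo_blocks maps a repo name to its ordered list of block hashes.
    A higher count would correspond to a darker heatmap cell.
    """
    others = [set(blocks) for name, blocks in repo_blocks.items() if name != repo]
    return [sum(block in other for other in others) for block in repo_blocks[repo]]


def repos_sharing(repo_blocks, repo, index):
    """List the other repos containing the block at `index` (the click action)."""
    block = repo_blocks[repo][index]
    return sorted(
        name for name, blocks in repo_blocks.items()
        if name != repo and block in blocks
    )


# Hypothetical block hashes, for illustration only.
family = {
    "xlm-r": ["b1", "b2", "b3"],
    "xlm-r-english": ["b1", "b2", "x4"],
    "xlm-r-dutch": ["b1", "y2", "y3"],
}
print(block_share_counts(family, "xlm-r"))  # [2, 1, 0]
print(repos_sharing(family, "xlm-r", 0))    # ['xlm-r-dutch', 'xlm-r-english']
```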

1 reply

updated 2 Spaces 5 days ago

Quantization Dedup

🚀

A view of dedupe from quants in bartowski/gemma-2-9b-it-GGUF

Finetune Dedupe

🌍

A view of dedupe across the XLM-RoBERTa family of finetunes

posted an update 5 days ago

The Llama 4 release - meta-llama/llama-4-67f0c30d9fe03840bc9d0164 - was a big one for the xet-team, with every model backed by the storage infrastructure of the future for the Hub.

It's been a wild few days, and especially 🤯 to see every tensor file with a Xet logo next to it instead of LFS.

The attached graph shows requests per second to our content-addressed store (CAS) right as the release went live.

yellow = GETs; dashed line = launch time.

You can definitely tell when the community started downloading 👀

h/t to @rajatarya for the graph, the entire Xet crew for bringing us to this point, and a special shoutout to Rajat, @port8080, @brianronan, @seanses, and @znation, who made sure the bytes kept flying all weekend ⚡️

1 reply

published a Space 6 days ago

Finetune Dedupe

🌍

A view of dedupe across the XLM-RoBERTa family of finetunes

upvoted a collection 6 days ago

Llama 4

Collection

Llama 4 release • 10 items • Updated 7 days ago • 419

updated a collection 6 days ago

Papers I Have Read

Collection

A list of papers that have moved off my reading list • 13 items • Updated 6 days ago

upvoted a paper 6 days ago

Unsupervised Cross-lingual Representation Learning at Scale

Paper • 1911.02116 • Published Nov 5, 2019 • 1

posted an update 7 days ago

Huge week for xet-team as Llama 4 is the first major model on Hugging Face uploaded with Xet providing the backing! Every byte downloaded comes through our infrastructure.

Using Xet on Hugging Face is the fastest way to download and iterate on open-source models, and we've proved it with Llama 4, which saw a boost of ~25% across all models.

We expect builders on the Hub to see even more improvements, helping power innovation across the community.

With the models on our infrastructure, we can peer in and see how well our dedupe performs across the Llama 4 family. On average, we're seeing ~25% dedupe, providing huge savings to the community who iterate on these state-of-the-art models. The attached image shows a few selected models and how they perform on Xet.
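
For a sense of what a dedupe percentage can mean, here is a small sketch under one assumed definition: the fraction of bytes that never need to be stored because an identical chunk is already present. The `dedupe_ratio` helper, the file names, and the chunk sizes are hypothetical, and the figures in the post may be computed differently.

```python
def dedupe_ratio(files):
    """Fraction of bytes saved by storing each distinct chunk once.

    files maps a file name to a list of (chunk_hash, size_in_bytes) pairs.
    """
    total = 0
    stored = {}
    for chunks in files.values():
        for chunk_hash, size in chunks:
            total += size
            stored.setdefault(chunk_hash, size)  # keep the first copy only
    if total == 0:
        return 0.0
    return 1 - sum(stored.values()) / total


# Hypothetical files: one 100-byte chunk ("a") appears in both.
files = {
    "model-00001.safetensors": [("a", 100), ("b", 100)],
    "model-00002.safetensors": [("a", 100), ("c", 100)],
}
print(f"{dedupe_ratio(files):.0%}")  # 25%
```

Here 400 bytes are referenced but only 300 are stored, so 100 / 400 = 25% is deduplicated.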

Thanks to the meta-llama team for launching on Xet!

upvoted an article 7 days ago

Article

Welcome Llama 4 Maverick & Scout on Hugging Face!

and 6 others • 8 days ago • 141

published an article 8 days ago

Article

Welcome Llama 4 Maverick & Scout on Hugging Face!

and 6 others • 8 days ago • 141

updated a model 8 days ago

jsulz/random-upload-tests

Updated 8 days ago

reacted to fdaudens's post with 🔥 8 days ago

Did we just drop personalized AI evaluation?! This tool auto-generates custom benchmarks on your docs to test which models are the best.

Most benchmarks test general capabilities, but what matters is how models handle your data and tasks. YourBench helps answer critical questions like:
- Do you really need a hundreds-of-billions-parameter model sledgehammer to crack a nut?
- Could a smaller, fine-tuned model work better?
- How well do different models understand your domain?

Some cool features:
📚 Generates custom benchmarks from your own documents (PDFs, Word, HTML)
🎯 Tests models on real tasks, not just general capabilities
🔄 Supports multiple models for different pipeline stages
🧠 Generates both single-hop and multi-hop questions
🔍 Evaluate top models and deploy leaderboards instantly
💰 Full cost analysis to optimize for your budget
🛠️ Fully configurable via a single YAML file

26 SOTA models tested for question generation. Interesting finding: Qwen2.5 32B leads in question diversity, while smaller Qwen models and Gemini 2.0 Flash offer great value for cost.

You can also run it locally on any models you want.

I'm impressed. Try it out: yourbench/demo