For Inference Providers that have built support for our Billing API (currently: Fal, Novita, HF-Inference – with more coming soon), we've started enabling Pay as you go (PAYG).
What this means is that you can use those Inference Providers beyond the free included credits, and the extra usage is charged to your HF account.
You can see it on this view: any provider that does not have a "Billing disabled" badge is PAYG-compatible.
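For concreteness, here's a minimal sketch of what PAYG routing looks like from the client side, assuming huggingface_hub's InferenceClient and its provider argument; the model id, prompt, and token are placeholders, not an official snippet.

```python
# Minimal sketch (illustrative, not official): calling a PAYG-enabled provider
# through the Hugging Face hub client, so usage beyond the included free
# credits is billed to your HF account.
from huggingface_hub import InferenceClient

# provider can be e.g. "fal-ai" or "novita"; authenticate with your HF token.
client = InferenceClient(provider="novita", api_key="hf_xxx")

response = client.chat_completion(
    model="deepseek-ai/DeepSeek-V3",  # placeholder model id
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```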
I was puzzled by the scope of 🐋DeepSeek🐋's projects, i.e. why they built (then open-sourced) so many pieces spanning their entire technology stack. Good engineers are minimalists: they build only when they have to.
Then I realized that FP8 is likely the main driving force here. On the H800, raw inter-GPU bandwidth is roughly half that of the H100. But if you compress your data representation from 16 bits to 8 bits, the effective throughput of your workload stays unchanged!
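A quick back-of-the-envelope sketch of that argument (the bandwidth figures are illustrative assumptions, not exact specs):

```python
# Halving the bytes per value compensates for halved interconnect bandwidth,
# keeping values-per-second roughly constant. Numbers below are illustrative.

def values_per_second(bandwidth_gb_s: float, bits_per_value: int) -> float:
    """How many tensor elements per second fit through the link."""
    bytes_per_value = bits_per_value / 8
    return bandwidth_gb_s * 1e9 / bytes_per_value

h100_bf16 = values_per_second(bandwidth_gb_s=900, bits_per_value=16)  # full link, 16-bit
h800_fp8 = values_per_second(bandwidth_gb_s=450, bits_per_value=8)    # ~half link, 8-bit

print(f"H100 + BF16: {h100_bf16:.2e} values/s")
print(f"H800 + FP8 : {h800_fp8:.2e} values/s")  # roughly the same
```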
The idea is simple, but a lot of work had to be done. Their v3 technical report will give you a holistic view (better than reading the code). To summarize: data structures are the foundation of any software. Since FP8 was new and untried, the ecosystem wasn't there, so DeepSeek became the trailblazer. Before cooking your meals, you need to till the land, grow the crops, and grind the flour 😅
Super happy to welcome Nvidia as our latest Enterprise Hub customer. They have almost 2,000 team members using Hugging Face, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!