[FEEDBACK] Inference Providers

#49
by julien-c - opened
Hugging Face org

Any inference provider you love, and that you'd like to be able to access directly from the Hub?

Hugging Face org
•
edited Jan 28

Love that I can call DeepSeek R1 directly from the Hub 🔥

from huggingface_hub import InferenceClient

# Route the request through the "together" provider, authenticating with your key
client = InferenceClient(
    provider="together",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx"
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

# Standard OpenAI-style chat completion against DeepSeek R1
completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500
)

print(completion.choices[0].message)

Is it possible to set a monthly payment budget or rate limits for all the external providers? I don't see such options in the billing tab. In case a key or session token is stolen, it could be quite dangerous for my thin wallet :(

Hugging Face org

@benhaotang you already get spending notifications when crossing important thresholds ($10, $100, $1,000) but we'll add spending limits in the future

Thanks for your quick reply, good to know!

Would be great if you could add Nebius AI Studio to the list :) New inference provider on the market, with the absolute cheapest prices and the highest rate limits...

Could be good to add featherless.ai

TitanML !!

Hugging Face org

@alexman83 Merve ( @merve ) opened https://github.com/huggingface/smolagents/pull/1260 which will expose the bill_to param in smolagents' InferenceClient 🔥

Hugging Face org

(you'll need to upgrade your smolagents version)
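
For anyone who wants to try this directly, the underlying huggingface_hub InferenceClient already accepts a bill_to argument. A minimal sketch, assuming a recent huggingface_hub release; "your-org-name" is a placeholder for an organization that has billing set up on the Hub:

from huggingface_hub import InferenceClient

# Sketch: bill provider usage to an organization instead of your personal account.
# "your-org-name" is a placeholder org; it must have billing enabled on the Hub.
client = InferenceClient(
    provider="together",
    bill_to="your-org-name",
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=200,
)

print(completion.choices[0].message)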

@julien-c We just started integrating WaveSpeedAI (http://wavespeed.ai) as an inference provider. We provide blazing-fast image and video generation services. Happy to see we are listed!

@julien-c why is wavespeed on the list? they serve like 2 models. runware is faster and better :)

We have no rate limits at nCompass (https://app.compass.tech) and offer fast inference. Feel free to give it a spin if rate limits are a bottleneck for you.

We want to become an inference provider and have submitted a PR (https://github.com/huggingface/huggingface.js/pull/1424). Could anyone please help take a look?

I'm getting a little mad... The private servers haven't been working for hours!!! The ZeroGPU spaces have been freezing with "worker error" for WEEKS!!! And the dev team doesn't seem in any hurry to fix them... For at least a week (during EASTER) the Stable Diffusion picture generators produced pics in APPLE II (80's) quality!!! At least they fixed those, but we're losing our patience... OK, the (credit-based) interface version of SD3.5 is working, but that's a poor excuse not to fix the others... Actually, the admins are REQUIRED to fix these errors by law... Yes, you can say that WE could fix it, but many of us aren't senior programmers, and in fact average users can't fix the API themselves, as the problem is (almost) never local... So, I officially demand that the developers fix the HF API (which is affecting the Spaces) by tomorrow morning (EU time)!!! I really don't want to have to file a case against you in court...

I am sad, too 🥹💔💔
