Hugging Face Optimum

non-profit

https://huggingface.co/docs/optimum

Activity Feed Request to join this org

AI & ML interests

Accelerating DL

Recent Activity

Jingya updated a model 6 days ago

optimum/pixart_sigma_pipe_xl_2_512_ms_neuronx

Jingya published a model 6 days ago

optimum/pixart_sigma_pipe_xl_2_512_ms_neuronx

Jingya updated a model 7 days ago

optimum/bge-base-en-v1.5-neuronx

View all activity

optimum's activity

Jingya

updated a model 6 days ago

optimum/pixart_sigma_pipe_xl_2_512_ms_neuronx

Updated 6 days ago

Jingya

published a model 6 days ago

optimum/pixart_sigma_pipe_xl_2_512_ms_neuronx

Updated 6 days ago

Jingya

updated 2 models 7 days ago

optimum/bge-base-en-v1.5-neuronx

Feature Extraction • Updated 7 days ago • 4

optimum/clip_vit_emb_neuronx

Updated 7 days ago • 9

Jingya

published a model 7 days ago

optimum/clip_vit_emb_neuronx

Updated 7 days ago • 9

jeffboudier

posted an update 8 days ago

Post

2067

Llama4 is out and Scout is already on the Dell Enterprise Hub to deploy on Dell systems 👉 dell.huggingface.co

Jingya

updated a model 9 days ago

optimum/clip-vit-base-patch32-image-classification-neuronx

Updated 9 days ago • 6

Jingya

published a model 9 days ago

optimum/clip-vit-base-patch32-image-classification-neuronx

Updated 9 days ago • 6

Jingya

updated a model 10 days ago

optimum/clip-vit-base-patch32-neuronx

Updated 10 days ago • 2

Jingya

published a model 10 days ago

optimum/clip-vit-base-patch32-neuronx

Updated 10 days ago • 2

jeffboudier

posted an update 11 days ago

Post

1472

Enterprise orgs now enable serverless Inference Providers for all members
- includes $2 free usage per org member (e.g. an Enterprise org with 1,000 members share $2,000 free credit each month)
- admins can set a monthly spend limit for the entire org
- works today with Together, fal, Novita, Cerebras and HF Inference.

Here's the doc to bill Inference Providers usage to your org: https://huggingface.co/docs/inference-providers/pricing#organization-billing

2 replies

Jingya

updated 2 models 11 days ago

optimum/yolos-tiny-neuronx-bs1

Updated 11 days ago • 55

optimum/bert-base-cased-swag-neuronx

Updated 11 days ago • 11

regisss

posted an update about 2 months ago

Post

1701

Nice paper comparing the fp8 inference efficiency of Nvidia H100 and Intel Gaudi2: An Investigation of FP8 Across Accelerators for LLM Inference (2502.01070)

The conclusion is interesting: "Our findings highlight that the Gaudi 2, by leveraging FP8, achieves higher throughput-to-power efficiency during LLM inference"

One aspect of AI hardware accelerators that is often overlooked is how they consume less energy than GPUs. It's nice to see researchers starting carrying out experiments to measure this!

Gaudi3 results soon...

pagezyhf

posted an update 2 months ago

Post

1739

We published https://huggingface.co/blog/deepseek-r1-aws!

If you are using AWS, give a read. It is a running document to showcase how to deploy and fine-tune DeepSeek R1 models with Hugging Face on AWS.

We're working hard to enable all the scenarios, whether you want to deploy to Inference Endpoints, Sagemaker or EC2; with GPUs or with Trainium & Inferentia.

We have full support for the distilled models, DeepSeek-R1 support is coming soon!! I'll keep you posted.

Cheers

1 reply

pagezyhf

posted an update 3 months ago

Post

455

Learn how to deploy multiple LoRA adapters on Vertex AI with this blogpost, using Hugging Face Deep Learning Containers on GCP.

https://medium.com/google-cloud/open-models-on-vertex-ai-with-hugging-face-serving-multiple-lora-adapters-on-vertex-ai-e3ceae7b717c

jeffboudier

posted an update 3 months ago

Post

742

NVIDIA just announced the Cosmos World Foundation Models, available on the Hub: nvidia/cosmos-6751e884dc10e013a0a0d8e6

Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release includes Tokenizers nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6

Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos

1 reply

regisss

posted an update 4 months ago

Post

1030

Nice to see day 1 support of Falcon 3 on Gaudi with Optimum Habana!

👉 https://www.intel.com/content/www/us/en/developer/articles/technical/intel-ai-solutions-support-falcon-3-fdn-models.html

pagezyhf

posted an update 4 months ago

Post

375

Today you are able to access some of the most famous models from the Hugging Face community in Amazon Bedrock 🤯

Amazon Bedrock expands its model catalog with Bedrock Marketplace to hundreds of specialized models.

https://us-west-2.console.aws.amazon.com/bedrock/home?region=us-west-2#/model-catalog

pagezyhf

posted an update 4 months ago

Post

982

It’s 2nd of December , here’s your Cyber Monday present 🎁 !

We’re cutting our price down on Hugging Face Inference Endpoints and Spaces!

Our folks at Google Cloud are treating us with a 40% price cut on GCP Nvidia A100 GPUs for the next 3️⃣ months. We have other reductions on all instances ranging from 20 to 50%.

Sounds like the time to give Inference Endpoints a try? Get started today and find in our documentation the full pricing details.
https://ui.endpoints.huggingface.co/
https://huggingface.co/pricing

AI & ML interests

Recent Activity

Team members 15

optimum's activity