Inference endpoint fails to deploy
Hi,
The HF inference endpoint fails to deploy with
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 317, in get_model
    raise NotImplementedError

NotImplementedError: Mixtral models requires flash attention v2, stk and megablocks
Any thoughts on this?
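For anyone hitting the same error: it names three packages the Mixtral path in text-generation-server needs. A quick probe run inside the container shows which ones are missing (a hedged sketch; the importable module names are assumed from the error message and the usual pip packages, and may differ by container build):

```python
import importlib.util

# Modules named in the NotImplementedError above. Assumption: flash-attention v2
# imports as "flash_attn", and stk / megablocks import under their own names.
required = ["flash_attn", "stk", "megablocks"]
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("missing:", ", ".join(missing))
else:
    print("all Mixtral prerequisites importable")
```

If any come up missing, the container image itself lacks the Mixtral prerequisites, so switching instance types alone won't help.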
Later edit: another attempt fails with

raise NotImplementedError("Mixtral does not support weight quantization yet.")

NotImplementedError: Mixtral does not support weight quantization yet.
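This second error suggests the endpoint config had a quantization option enabled. As a sketch of the equivalent local TGI launch with quantization left off (assuming the standard text-generation-inference container; the image tag and exact flags may differ by version), a config fragment would look like:

```shell
# Hedged sketch: run TGI with NO --quantize option, since Mixtral rejected
# quantized weights at the time of this thread.
# The ":latest" tag is an assumption; use whatever tag your endpoint uses.
docker run --gpus all --shm-size 1g -p 8080:80 \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id mistralai/Mixtral-8x7B-Instruct-v0.1 \
  --num-shard 2
# Deliberately no "--quantize bitsandbytes" / "--quantize gptq" here.
```

On Inference Endpoints the equivalent is leaving the quantization setting unset in the endpoint configuration.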
What instance type, container and config did you use? The default config should work with 2x A100 80GB, or use this link: https://ui.endpoints.huggingface.co/new?repository=mistralai%2FMixtral-8x7B-Instruct-v0.1&vendor=aws&region=us-east-1&accelerator=gpu&instance_size=2xlarge&task=text-generation&no_suggested_compute=true&tgi=true&tgi_max_batch_total_tokens=1024000&tgi_max_total_tokens=32000
Gotcha, thanks for the info. I was following the UI and tried with the first available instance type that didn't say "Low Memory". Will try with 2xA100 once I get access to it. Thanks.
Got access to 2xA100, and now it doesn't seem to get past this point:
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/whoami-v2 (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f88d9b97b80>: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)"))
Anything else you reckon I should try?
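One thing worth checking: that name-resolution failure can be reproduced independently of TGI with a minimal probe (a hedged sketch; run it from wherever the endpoint container runs):

```python
import socket

# Reproduce the NameResolutionError above without TGI. If this fails,
# the container has no working outbound DNS (e.g. a VPC/egress
# misconfiguration), which is an infrastructure issue, not a model one.
try:
    infos = socket.getaddrinfo("huggingface.co", 443)
    print(f"resolved huggingface.co to {len(infos)} address(es)")
except socket.gaierror as err:
    print(f"DNS resolution failed: {err}")
```

If the probe fails too, the fix is on the networking side rather than in the endpoint config.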