Error when deploying this model via Inference Endpoints

#5 opened by edgelesssys

When deploying this model via Inference Endpoints on Hugging Face, the following error occurs:

Exit code: 1. Reason: rn loop.run_until_complete(main)

  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 229, in serve_inner
    model = get_model_with_lora_adapters(

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 1152, in get_model_with_lora_adapters
    model = get_model(

  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 487, in get_model
    if max_input_tokens is not None and max_input_tokens <= sliding_window:

TypeError: '<=' not supported between instances of 'int' and 'NoneType'
 rank=3
2024-08-07T15:38:14.094704Z ERROR text_generation_launcher: Shard 1 failed to start
2024-08-07T15:38:14.094724Z  INFO text_generation_launcher: Shutting down shards
2024-08-07T15:38:14.096975Z  INFO shard-manager: text_generation_launcher: Terminating shard rank=0
2024-08-07T15:38:14.097001Z  INFO shard-manager: text_generation_launcher: Waiting for shard to gracefully shutdown rank=0
2024-08-07T15:38:14.097313Z  INFO shard-manager: text_generation_launcher: Terminating shard rank=2
2024-08-07T15:38:14.097335Z  INFO shard-manager: text_generation_launcher: Waiting for shard to gracefully shutdown rank=2
2024-08-07T15:38:14.197291Z  INFO shard-manager: text_generation_launcher: shard terminated rank=0
2024-08-07T15:38:14.197578Z  INFO shard-manager: text_generation_launcher: shard terminated rank=2
Error: ShardCannotStart
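
For context, the crash is an int-vs-None comparison: the model config apparently leaves sliding_window unset (None), so the check at text_generation_server/models/__init__.py line 487 (`max_input_tokens <= sliding_window`) raises the TypeError shown above. Below is a minimal sketch, not the actual TGI code or patch, that reproduces the failing condition and shows a defensive guard that would avoid it; the function name `check_sliding_window` is hypothetical.

```python
from typing import Optional

def check_sliding_window(max_input_tokens: Optional[int],
                         sliding_window: Optional[int]) -> bool:
    # The traceback shows a comparison of the form
    # `max_input_tokens <= sliding_window`, which raises
    # TypeError when sliding_window is None.
    # Guarding both values avoids comparing an int against None.
    return (
        max_input_tokens is not None
        and sliding_window is not None
        and max_input_tokens <= sliding_window
    )

# With sliding_window=None (as this model's config seems to produce),
# the guarded version returns False instead of raising TypeError.
print(check_sliding_window(max_input_tokens=4096, sliding_window=None))
```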
