About
This directory contains a script for running benchmarks (including energy consumption) against models served by a dedicated inference server. The script is adapted from vLLM.
The script currently supports TGI and vLLM. Before running the benchmark script, make sure the inference server hosting the relevant model is up and reachable.
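
Since the benchmark needs a running server, it can help to check that the server responds before starting a run. The following is a minimal sketch, not part of the benchmark script: `wait_for_server` is a hypothetical helper, and the `/health` path, host, and port are assumptions that should be adjusted to match how your TGI or vLLM server is deployed.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(health_url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll health_url until it answers HTTP 200 or timeout_s elapses.

    Returns True once the server answers with 200, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(health_url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not reachable yet (or returned an error status)
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example call (assumed address; replace with your server's host and port):
# wait_for_server("http://localhost:8000/health")
```

If the function returns `False`, start or fix the inference server before launching the benchmark, so that connection errors are not recorded as benchmark results.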