About

This directory contains a script for running benchmarks (including energy consumption) on models that are hosted on a dedicated inference server. The script is adapted from vLLM.

The current script supports TGI and vLLM. Before running the benchmark script, make sure the inference server hosting the relevant model is up and running.
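
One way to confirm that the server is up before starting a benchmark is to poll it until it responds. The following is a minimal sketch and not part of the benchmark script: `wait_for_server` is a hypothetical helper, and the URL it is given (host, port, and a health-check path such as `/health`) is an assumption that should be adjusted to match how your TGI or vLLM server was launched.

```python
import time
import urllib.request


def wait_for_server(url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses.

    Hypothetical helper: the URL (including any health-check path) is an
    assumption supplied by the caller, not something the benchmark script defines.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers connection refused, timeouts, and HTTP error statuses
            # (urllib's URLError and HTTPError are subclasses of OSError).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For example, `wait_for_server("http://localhost:8080/health")` returns `True` once the server at that (assumed) address answers, and `False` if it is still unreachable after 60 seconds.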