# Benchmarking models

To use `bench-TriLMs.sh`, you need to:

- Place it in a `llama.cpp` checkout.
- Have `cmake`, `gcc`, and the other dependencies of `llama.cpp` installed.
- To benchmark on GPUs, have `nvidia-smi` available (the script checks whether it is present), along with the necessary compile-time dependencies.
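
Before running the script, it can help to confirm that the tools named above are on `PATH`. The following is a minimal sketch of such a check, not something the script itself does; the helper name `missing_tools` and its parameters are hypothetical, and it only covers the tools this list names explicitly, not every dependency of `llama.cpp`.

```python
from __future__ import annotations

import shutil
from typing import Callable, Optional

# Tools named in the list above; "nvidia-smi" only matters for GPU runs.
REQUIRED_TOOLS = ("cmake", "gcc")
GPU_TOOLS = ("nvidia-smi",)


def missing_tools(
    want_gpu: bool = False,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names of the listed tools that cannot be found on PATH."""
    tools = REQUIRED_TOOLS + (GPU_TOOLS if want_gpu else ())
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(want_gpu=True)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is enough.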

The script automatically downloads the models and quantizes them into different variants.

It then produces two result files: `results-$(date +%s).json` and `results-$(date +%s)-cpuinfo.txt`. Both use the exact same timestamp.
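
Because both files share a timestamp, one can be located from the other. The sketch below assumes only the naming scheme described above; the helper names `cpuinfo_path_for` and `latest_results` are hypothetical and not part of the script.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches names of the form results-<unix timestamp>.json
RESULTS_RE = re.compile(r"^results-(\d+)\.json$")


def cpuinfo_path_for(json_path: Path) -> Path:
    """Return the cpuinfo file that shares the timestamp of `json_path`."""
    match = RESULTS_RE.match(json_path.name)
    if match is None:
        raise ValueError(f"unexpected results file name: {json_path.name}")
    return json_path.with_name(f"results-{match.group(1)}-cpuinfo.txt")


def latest_results(directory: Path) -> Path | None:
    """Return the results file with the largest timestamp, or None if there is none."""
    candidates = [p for p in directory.glob("results-*.json") if RESULTS_RE.match(p.name)]
    return max(
        candidates,
        key=lambda p: int(RESULTS_RE.match(p.name).group(1)),
        default=None,
    )
```

Timestamps are compared as integers rather than as strings, so a longer timestamp always sorts after a shorter one.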

The intention is to eventually read the produced `.json` file in a Python script with:

```python3
from __future__ import annotations

from typing import Any
import json

# Replace the timestamp with the one from your own run.
with open("results-1234567890.json") as f:
    # Wrap the file contents in brackets so they parse as a single JSON list.
    data: list[list[dict[str, Any]]] = json.loads("[" + f.read() + "]")

# Then use that data
...
```
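
As a possible next step, the nested lists can be flattened into one list of entries and inspected for the fields they share. This is a sketch only: it assumes the file holds comma-separated JSON arrays (as the bracket-wrapping above implies), the helper names `load_runs` and `common_keys` are hypothetical, and the sample data is made up for illustration and does not reflect the script's real fields.

```python
from __future__ import annotations

import json
from typing import Any


def load_runs(text: str) -> list[dict[str, Any]]:
    """Parse the file contents as in the snippet above and flatten the nested lists."""
    nested: list[list[dict[str, Any]]] = json.loads("[" + text + "]")
    return [entry for group in nested for entry in group]


def common_keys(runs: list[dict[str, Any]]) -> set[str]:
    """Return the keys present in every entry."""
    if not runs:
        return set()
    keys = set(runs[0])
    for run in runs[1:]:
        keys &= set(run)
    return keys


# Hypothetical sample in the assumed layout: comma-separated JSON arrays.
sample = '[{"name": "a", "value": 1}], [{"name": "b", "value": 2, "extra": true}]'
runs = load_runs(sample)
print(len(runs), sorted(common_keys(runs)))
```

Listing the shared keys first is a cheap way to see which fields are safe to rely on before writing any analysis code against them.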