# Benchmarking models

To use `bench-TriLMs.sh`, you need to

- Place it in a `llama.cpp` checkout
- Have `cmake`, `gcc`, and the other build dependencies of `llama.cpp`
- For GPU benchmarks, have the necessary compile-time dependencies installed; the script decides whether to benchmark on GPUs by checking whether `nvidia-smi` is present (see the sketch after this list)
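The GPU detection mentioned in the last point is just a presence check. A minimal sketch of that idea in Python (the script itself is a shell script, so this is only an illustration, not its actual code):

```python3
from shutil import which

# Treat `nvidia-smi` being on PATH as the signal that GPU benchmarks are possible.
if which("nvidia-smi") is not None:
    print("nvidia-smi found: GPU benchmarks can be attempted")
else:
    print("nvidia-smi not found: CPU-only benchmarks")
```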

The script automatically downloads the models and quantizes them into several variants.

It then produces 2 result files, one called `results-$(date +%s).json` and the other called `results-$(date +%s)-cpuinfo.txt`. Both names use the exact same timestamp.
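
Since both names share the same timestamp and differ only in their suffix, the companion `cpuinfo` file can be derived from the JSON path. A minimal sketch (the timestamp `1234567890` is just a placeholder):

```python3
from pathlib import Path

results_json = Path("results-1234567890.json")
# The matching CPU info file produced in the same run.
cpuinfo_txt = results_json.with_name(results_json.stem + "-cpuinfo.txt")
print(cpuinfo_txt)  # results-1234567890-cpuinfo.txt
```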

The intention is to eventually read the produced `.json` file in a Python script with something like:

```python3
from __future__ import annotations

from typing import Any
import json

with open("result-1234567890.json") as f:
    data: list[list[dict[str, Any]]] = json.loads("[" + f.read() + "]")

    # Then use that data
    ...
```
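
From there, the nested lists can be flattened into a single list of benchmark records before further processing. This continues the block above and only assumes the `list[list[dict[str, Any]]]` shape; the exact fields depend on what the script writes:

```python3
# Flatten the per-run lists into one list of result dictionaries.
records = [entry for run in data for entry in run]
print(f"{len(records)} benchmark entries")
```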