This folder contains the evaluation suite for the DuckDB-Text2SQL model.

Please install the dependencies listed in the requirements.txt file located in the parent folder.

## Setup

To evaluate against the benchmark dataset, first set up the test-suite evaluation harness:
```
mkdir metrics
cd metrics
git clone git@github.com:ElementAI/test-suite-sql-eval.git test_suite_sql_eval
cd ..
```
In the `test_suite_sql_eval` folder, add a new remote and check out the latest duckdb-only branch (640a12975abf75a94e917caca149d56dbc6bcdd7) to evaluate against DuckDB.
```
git remote add till https://github.com/tdoehmen/test-suite-sql-eval.git
git fetch till
git checkout till/duckdb-only
```
Next, prepare the docs for retrieval.
```
mkdir docs
cd docs
git clone https://github.com/duckdb/duckdb-web.git
cd ..
```
#### Dataset

The benchmark dataset is located in the `data/` folder and includes all databases (`data/databases`), table schemas (`data/tables.json`), and examples (`data/dev.json`).
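To get a quick feel for the benchmark, a minimal sketch like the following (assuming both files are JSON arrays of records) loads the examples and schemas and prints their sizes:

```python
import json

# Load the benchmark examples and table schemas shipped with the suite
# (paths are relative to the DuckDB-NSQL main folder).
with open("eval/data/dev.json") as f:
    examples = json.load(f)
with open("eval/data/tables.json") as f:
    tables = json.load(f)

print(f"{len(examples)} evaluation examples")
print(f"{len(tables)} schema entries")
# Peek at the first example's fields without assuming their names.
print(sorted(examples[0].keys()))
```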
#### Eval

Start a manifest session with the model you want to evaluate.
```bash
python -m manifest.api.app \
    --model_type huggingface \
    --model_generation_type text-generation \
    --model_name_or_path motherduckdb/DuckDB-NSQL-7B-v0.1 \
    --fp16 \
    --device 0
```
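Before kicking off predictions, you can verify that this server is reachable at the connection string used below (`http://localhost:5000`). The sketch is only a connectivity probe; it does not assume any particular endpoint path:

```python
import urllib.request
import urllib.error

# Probe the manifest server started in the previous step.
try:
    urllib.request.urlopen("http://localhost:5000", timeout=5)
    print("Server is reachable at http://localhost:5000")
except urllib.error.HTTPError as e:
    # An HTTP error still means a server answered (e.g. 404 on the root path).
    print(f"Server responded with HTTP {e.code}; it appears to be up")
except urllib.error.URLError as e:
    print(f"Could not reach the server: {e}")
```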
Then, from the `DuckDB-NSQL` main folder, run:
```bash
python eval/predict.py \
    predict \
    eval/data/dev.json \
    eval/data/tables.json \
    --output-dir output/ \
    --stop-tokens ';' \
    --stop-tokens '--' \
    --stop-tokens '```' \
    --stop-tokens '###' \
    --overwrite-manifest \
    --manifest-client huggingface \
    --manifest-connection http://localhost:5000 \
    --prompt-format duckdbinst
```
This will format the prompts using the duckdbinst style.
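For intuition only, an instruction-style text-to-SQL prompt pairs the schema with the natural-language question. The snippet below is a hypothetical illustration, not the actual duckdbinst template used by the eval code:

```python
# Hypothetical illustration of an instruction-style text-to-SQL prompt.
# The real duckdbinst template is defined in the eval code and may differ.
schema = "CREATE TABLE taxi (trip_id INTEGER, fare DOUBLE, tip DOUBLE);"
question = "What is the average tip for trips with a fare above 20?"

prompt = (
    "Here is the database schema that the SQL query will run on:\n"
    f"{schema}\n\n"
    f"Question: {question}\n"
    "Write a DuckDB SQL query that answers the question."
)
print(prompt)
```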
To evaluate the predictions, first make sure the DuckDB `httpfs` extension can be installed and loaded by running the following in a Python shell:
```python
try:
    import duckdb

    # Install and load the httpfs extension (HTTP(S)/S3 file access).
    con = duckdb.connect()
    con.install_extension("httpfs")
    con.load_extension("httpfs")
except Exception as e:
    print(f"Error loading duckdb extensions: {e}")
```
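You can optionally confirm the extension status afterwards by querying DuckDB's built-in `duckdb_extensions()` table function:

```python
import duckdb

con = duckdb.connect()
# Report whether the httpfs extension is installed and currently loaded.
status = con.execute(
    "SELECT extension_name, installed, loaded "
    "FROM duckdb_extensions() WHERE extension_name = 'httpfs'"
).fetchall()
print(status)
```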
Then, run the evaluation script:
```bash
python eval/evaluate.py \
    evaluate \
    --gold eval/data/dev.json \
    --db eval/data/databases/ \
    --tables eval/data/tables.json \
    --output-dir output/ \
    --pred [PREDICTION_FILE]
```
All output information is written to the prediction file in the specified `--output-dir`. In that file, `query` is the gold SQL and `pred` is the model's prediction.
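To inspect results programmatically, a small sketch along these lines (assuming the prediction file is a JSON array of records; switch to line-by-line parsing if it turns out to be JSON-lines, and substitute your actual file name) prints gold and predicted SQL side by side:

```python
import json

# "output/predictions.json" is a placeholder; use the prediction file
# produced in your output-dir.
with open("output/predictions.json") as f:
    records = json.load(f)

for rec in records[:5]:
    print("gold:", rec["query"])  # gold SQL
    print("pred:", rec["pred"])   # predicted SQL
    print("-" * 40)
```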