<!--
title: OpenFactCheck
emoji: ✅
colorFrom: green
colorTo: purple
sdk: streamlit
app_file: src/openfactcheck/app/app.py
pinned: false
-->

<p align="center">
<img alt="OpenFactCheck Logo" src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/splash.png" height="120" />
<p align="center">An Open-source Factuality Evaluation Demo for LLMs
<br>
</p>
</p>

---

<p align="center">
<a href="https://github.com/hasaniqbal777/OpenFactCheck/actions/workflows/release.yaml">
<img src="https://img.shields.io/github/actions/workflow/status/hasaniqbal777/openfactcheck/release.yaml?logo=github&label=Release" alt="Release">
</a>
<a href="https://readthedocs.org/projects/openfactcheck/builds/">
<img alt="Docs" src="https://img.shields.io/readthedocs/openfactcheck?logo=readthedocs&label=Docs">
</a>
<br>
<a href="https://opensource.org/licenses/Apache-2.0">
<img src="https://img.shields.io/github/license/hasaniqbal777/openfactcheck" alt="License: Apache-2.0">
</a>
<a href="https://pypi.org/project/openfactcheck/">
<img src="https://img.shields.io/pypi/pyversions/openfactcheck.svg" alt="Python Version">
</a>
<a href="https://pypi.org/project/openfactcheck/">
<img src="https://img.shields.io/pypi/v/openfactcheck.svg" alt="PyPI Latest Release">
</a>
<a href="https://arxiv.org/abs/2405.05583"><img src="https://img.shields.io/badge/arXiv-2405.05583-B31B1B" alt="arXiv"></a>
<a href="https://zenodo.org/doi/10.5281/zenodo.13358664"><img src="https://img.shields.io/badge/DOI-10.5281/zenodo.13358664-blue" alt="DOI"></a>
</p>

---

<p align="center">
<a href="#overview">Overview</a> •
<a href="#installation">Installation</a> •
<a href="#usage">Usage</a> •
<a href="https://huggingface.co/spaces/hasaniqbal777/OpenFactCheck">HuggingFace Demo</a> •
<a href="https://openfactcheck.readthedocs.io/">Documentation</a>
</p>

## Overview

OpenFactCheck is an open-source repository designed to facilitate the evaluation and enhancement of factuality in responses generated by large language models (LLMs). This project aims to integrate various fact-checking tools into a unified framework and provide comprehensive evaluation pipelines.

<img src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/architecture.png" width="100%">

## Installation

You can install the package from PyPI using pip:

```bash
pip install openfactcheck
```

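To confirm that the install succeeded, a quick check with the standard library can report the installed version. This is a minimal sketch: the helper name `installed_version` is hypothetical, and it relies only on `importlib.metadata`, not on anything specific to OpenFactCheck.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "openfactcheck" is the distribution name used in the pip command above.
    found = installed_version("openfactcheck")
    print(found if found else "openfactcheck is not installed")
```
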
## Usage

First, initialize an `OpenFactCheckConfig` object, then pass it to `OpenFactCheck`.

```python
from openfactcheck import OpenFactCheck, OpenFactCheckConfig

# Initialize the configuration, then the OpenFactCheck object
config = OpenFactCheckConfig()
ofc = OpenFactCheck(config)
```

### Response Evaluation

You can evaluate a response using the `ResponseEvaluator` class.

```python
# Evaluate a response (the string below is a placeholder for the text to check)
response = "<LLM response to fact-check>"
result = ofc.ResponseEvaluator.evaluate(response=response)
```

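When several responses need checking, the call above can be wrapped in a small loop. The sketch below is an illustration under stated assumptions, not part of the OpenFactCheck API: `evaluate_many` is a hypothetical helper that accepts any callable taking a response string, so `ofc.ResponseEvaluator.evaluate` could be passed in, and it makes no assumption about what that callable returns.

```python
from typing import Any, Callable, Dict, Iterable, List


def evaluate_many(evaluate: Callable[[str], Any],
                  responses: Iterable[str]) -> List[Dict[str, Any]]:
    """Apply `evaluate` to each response, collecting either a result or an error."""
    outcomes = []
    for response in responses:
        try:
            outcomes.append({"response": response,
                             "result": evaluate(response),
                             "error": None})
        except Exception as exc:  # record the failure and continue with the rest
            outcomes.append({"response": response,
                             "result": None,
                             "error": str(exc)})
    return outcomes
```

With the `ofc` object from the Usage section, this could be called as `evaluate_many(ofc.ResponseEvaluator.evaluate, responses)`, where `responses` is a list of strings. Recording errors per item keeps one failing response from discarding the results already collected.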
### LLM Evaluation

We provide [FactQA](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/llm/questions.csv), a dataset of 6480 questions for evaluating LLMs. Once you have the responses from the LLM, you can evaluate them using the `LLMEvaluator` class.

```python
# Evaluate an LLM (both values below are placeholders)
model_name = "<name of the LLM>"
input_path = "<path to the LLM's responses>"
result = ofc.LLMEvaluator.evaluate(model_name=model_name,
                                   input_path=input_path)
```

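Before generating responses, it can help to look at how the question file is laid out. The sketch below assumes a local copy of the FactQA CSV with a header row and uses only Python's `csv` module. `summarize_csv` is a hypothetical helper, and the column names are read from the file itself rather than assumed.

```python
import csv
from typing import List, Tuple


def summarize_csv(path: str) -> Tuple[List[str], int]:
    """Return the header columns and the number of data rows in a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Iterating consumes the data rows; the header is parsed on first read.
        row_count = sum(1 for _ in reader)
        columns = list(reader.fieldnames or [])
    return columns, row_count
```

For example, `columns, row_count = summarize_csv("questions.csv")` returns the header names and the number of data rows in a downloaded copy of the file.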
### Checker Evaluation

We provide [FactBench](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/factchecker/claims.jsonl), a dataset of 4507 claims for evaluating fact-checkers. Once you have the responses from the fact-checker, you can evaluate them using the `CheckerEvaluator` class.

```python
# Evaluate a fact-checker (both values below are placeholders)
checker_name = "<name of the fact-checker>"
input_path = "<path to the fact-checker's responses>"
result = ofc.CheckerEvaluator.evaluate(checker_name=checker_name,
                                       input_path=input_path)
```

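The FactBench claims file uses the `.jsonl` extension, which conventionally means one JSON value per line. A minimal sketch for loading a local copy follows. `load_jsonl` is a hypothetical helper, and it makes no assumption about which fields each claim record contains.

```python
import json
from typing import Any, List


def load_jsonl(path: str) -> List[Any]:
    """Parse a JSON Lines file: one JSON value per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

Calling `claims = load_jsonl("claims.jsonl")` on a downloaded copy yields a list that can be inspected (for example, `claims[0]`) to see the record layout before running a fact-checker over it.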
## Cite

If you use OpenFactCheck in your research, please cite the following:

```bibtex
@article{wang2024openfactcheck,
  title        = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
  author       = {Wang, Yuxia and Wang, Minghan and Iqbal, Hasan and Georgiev, Georgi and Geng, Jiahui and Nakov, Preslav},
  journal      = {arXiv preprint arXiv:2405.05583},
  year         = {2024}
}

@article{iqbal2024openfactcheck,
  title        = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
  author       = {Iqbal, Hasan and Wang, Yuxia and Wang, Minghan and Georgiev, Georgi and Geng, Jiahui and Gurevych, Iryna and Nakov, Preslav},
  journal      = {arXiv preprint arXiv:2408.11832},
  year         = {2024}
}

@software{hasan_iqbal_2024_13358665,
  author       = {Hasan Iqbal},
  title        = {hasaniqbal777/OpenFactCheck: v0.3.0},
  month        = {aug},
  year         = {2024},
  publisher    = {Zenodo},
  version      = {v0.3.0},
  doi          = {10.5281/zenodo.13358665},
  url          = {https://doi.org/10.5281/zenodo.13358665}
}
```