<!--
title: OpenFactCheck
emoji: ✅
colorFrom: green
colorTo: purple
sdk: streamlit
app_file: src/openfactcheck/app/app.py
pinned: false
-->
<p align="center">
<img alt="OpenFactCheck Logo" src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/splash.png" height="120" />
<p align="center">An Open-source Factuality Evaluation Demo for LLMs
<br>
</p>
</p>

---

<p align="center">
<a href="https://github.com/hasaniqbal777/OpenFactCheck/actions/workflows/release.yaml">
<img src="https://img.shields.io/github/actions/workflow/status/hasaniqbal777/openfactcheck/release.yaml?logo=github&label=Release" alt="Release">
</a>
<a href="https://readthedocs.org/projects/openfactcheck/builds/">
<img alt="Docs" src="https://img.shields.io/readthedocs/openfactcheck?logo=readthedocs&label=Docs">
</a>
<br>
<a href="https://gnu.org/licenses/gpl-3.0.html">
<img src="https://img.shields.io/github/license/hasaniqbal777/openfactcheck" alt="License">
</a>
<a href="https://pypi.org/project/openfactcheck/">
<img src="https://img.shields.io/pypi/pyversions/openfactcheck.svg" alt="Python Version">
</a>
<a href="https://pypi.org/project/openfactcheck/">
<img src="https://img.shields.io/pypi/v/openfactcheck.svg" alt="PyPI Latest Release">
</a>
<a href="https://arxiv.org/abs/2405.05583"><img src="https://img.shields.io/badge/arXiv-2405.05583-B31B1B" alt="arXiv"></a>
<a href="https://zenodo.org/doi/10.5281/zenodo.13358664"><img src="https://img.shields.io/badge/DOI-10.5281/zenodo.13358664-blue" alt="DOI"></a>
</p>

---
<p align="center">
<a href="#overview">Overview</a> •
<a href="#installation">Installation</a> •
<a href="#usage">Usage</a> •
<a href="https://huggingface.co/spaces/hasaniqbal777/OpenFactCheck">HuggingFace Demo</a> •
<a href="https://openfactcheck.readthedocs.io/">Documentation</a>
</p>

## Overview

OpenFactCheck is an open-source repository designed to facilitate the evaluation and enhancement of factuality in responses generated by large language models (LLMs). This project aims to integrate various fact-checking tools into a unified framework and provide comprehensive evaluation pipelines.

<img src="https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/assets/architecture.png" width="100%">

## Installation

You can install the package from PyPI using pip:

```bash
pip install openfactcheck
```

## Usage

First, initialize an `OpenFactCheckConfig` object, then pass it to `OpenFactCheck`.

```python
from openfactcheck import OpenFactCheck, OpenFactCheckConfig

# Create the configuration, then the OpenFactCheck object that uses it
config = OpenFactCheckConfig()
ofc = OpenFactCheck(config)
```

### Response Evaluation

You can evaluate a response using the `ResponseEvaluator` class.

```python
# Evaluate a response (any text whose factuality you want to check)
response = "The Eiffel Tower is located in Paris."
result = ofc.ResponseEvaluator.evaluate(response=response)
```

### LLM Evaluation

We provide [FactQA](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/llm/questions.csv), a dataset of 6480 questions for evaluating LLMs. Once you have the responses from the LLM, you can evaluate them using the `LLMEvaluator` class.

```python
# Evaluate an LLM from a file of its responses
# (replace the placeholder strings with your own values)
result = ofc.LLMEvaluator.evaluate(
    model_name="<model_name>",
    input_path="<path_to_llm_responses>",
)
```
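Producing the responses file is left to the user. The following is a minimal sketch of one way to do it, not part of OpenFactCheck: `write_responses` is a hypothetical helper, `generate` stands in for whatever function calls your LLM, and the column names `question` and `response` are assumptions, so check them against the FactQA file and the documentation before relying on them.

```python
import csv
from typing import Callable


def write_responses(
    questions_path: str,
    output_path: str,
    generate: Callable[[str], str],
    question_column: str = "question",  # assumed column name in the questions CSV
    response_column: str = "response",  # assumed column name for the answers
) -> int:
    """Answer every question in a CSV file and write the rows, plus answers, to a new CSV.

    Returns the number of rows written.
    """
    with open(questions_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    if question_column not in fieldnames:
        raise KeyError(f"column {question_column!r} not found in {questions_path}")
    if response_column not in fieldnames:
        fieldnames.append(response_column)

    # Call the model once per question and store its answer alongside the row.
    for row in rows:
        row[response_column] = generate(row[question_column])

    with open(output_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    return len(rows)
```

The path passed as `output_path` could then be supplied as `input_path` to `LLMEvaluator.evaluate`, provided the evaluator accepts that layout.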

### Checker Evaluation

We provide [FactBench](https://raw.githubusercontent.com/hasaniqbal777/OpenFactCheck/main/src/openfactcheck/templates/factchecker/claims.jsonl), a dataset of 4507 claims for evaluating fact-checkers. Once you have the responses from the fact-checker, you can evaluate them using the `CheckerEvaluator` class.

```python
# Evaluate a fact-checker from a file of its responses
# (replace the placeholder strings with your own values)
result = ofc.CheckerEvaluator.evaluate(
    checker_name="<checker_name>",
    input_path="<path_to_checker_responses>",
)
```
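The FactBench file linked above has a `.jsonl` extension, which conventionally means JSON Lines: one JSON value per line. As a minimal sketch for loading the claims before running your own fact-checker on them, the hypothetical helper `read_jsonl` below (not part of OpenFactCheck) parses such a file with the standard library only. It makes no assumption about the field names inside each record.

```python
import json
from typing import Any, Iterator


def read_jsonl(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report which line is malformed instead of a bare parse error.
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON ({exc.msg})"
                ) from exc
```

For example, `claims = list(read_jsonl("claims.jsonl"))` would load a locally downloaded copy into memory; inspect the first record to see which fields it actually contains.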

## Cite

If you use OpenFactCheck in your research, please cite the following:

```bibtex
@article{wang2024openfactcheck,
  title   = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
  author  = {Wang, Yuxia and Wang, Minghan and Iqbal, Hasan and Georgiev, Georgi and Geng, Jiahui and Nakov, Preslav},
  journal = {arXiv preprint arXiv:2405.05583},
  year    = {2024}
}

@article{iqbal2024openfactcheck,
  title   = {OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs},
  author  = {Iqbal, Hasan and Wang, Yuxia and Wang, Minghan and Georgiev, Georgi and Geng, Jiahui and Gurevych, Iryna and Nakov, Preslav},
  journal = {arXiv preprint arXiv:2408.11832},
  year    = {2024}
}

@software{hasan_iqbal_2024_13358665,
  author    = {Hasan Iqbal},
  title     = {hasaniqbal777/OpenFactCheck: v0.3.0},
  month     = {aug},
  year      = {2024},
  publisher = {Zenodo},
  version   = {v0.3.0},
  doi       = {10.5281/zenodo.13358665},
  url       = {https://doi.org/10.5281/zenodo.13358665}
}
```