# UTAustin-AIHealth
Welcome to **UTAustin-AIHealth** – a hub dedicated to advancing research in medical AI.
This repo contains the **MedHallu** dataset, which underpins our recent work:
**MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models**
MedHallu is a rigorously designed benchmark for evaluating large language models' ability to detect hallucinations in medical question answering.
The dataset is organized into two distinct splits:
- **pqa_labeled:** Contains 1,000 high-quality, human-annotated samples derived from PubMedQA.
- **pqa_artificial:** Contains 9,000 samples generated via an automated pipeline from PubMedQA.
---
## Setup Environment
To work with the MedHallu dataset, please install the Hugging Face `datasets` library using pip:
```bash
pip install datasets
```
## How to Use MedHallu
**Downloading the Dataset:**
```python
from datasets import load_dataset
# Load the 'pqa_labeled' split: 1,000 high-quality, human-annotated samples.
medhallu_labeled = load_dataset("UTAustin-AIHealth/MedHallu", "pqa_labeled")
# Load the 'pqa_artificial' split: 9,000 samples generated via an automated pipeline.
medhallu_artificial = load_dataset("UTAustin-AIHealth/MedHallu", "pqa_artificial")
```
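Once a split is loaded, a detector can be scored by checking whether it flags hallucinated answers and accepts grounded ones. The sketch below is a minimal, hypothetical evaluation loop: the field names (`question`, `ground_truth`, `hallucinated_answer`) and the toy rows are illustrative assumptions, not the actual MedHallu schema — inspect `medhallu_labeled["train"].column_names` to see the real fields before adapting it.

```python
# Minimal sketch of a hallucination-detection evaluation loop.
# NOTE: field names below are illustrative assumptions; check the
# dataset's real column names before relying on them.

def evaluate_detector(samples, detector):
    """Accuracy of a detector that returns True when it flags an answer
    as hallucinated. Each sample contributes one grounded and one
    hallucinated answer, so the denominator is 2 * len(samples)."""
    correct = 0
    for sample in samples:
        # The detector should flag the hallucinated answer...
        if detector(sample["question"], sample["hallucinated_answer"]) is True:
            correct += 1
        # ...and accept the grounded one.
        if detector(sample["question"], sample["ground_truth"]) is False:
            correct += 1
    return correct / (2 * len(samples))

# Toy rows standing in for MedHallu samples (not real data):
toy = [
    {"question": "Does aspirin reduce fever?",
     "ground_truth": "Yes, aspirin is an antipyretic.",
     "hallucinated_answer": "Aspirin cures bacterial infections outright."},
]
# Trivial keyword heuristic, purely for demonstration:
always_flag = lambda q, a: "cures" in a
print(evaluate_detector(toy, always_flag))  # -> 1.0 on this toy example
```

In practice you would replace the keyword heuristic with an LLM prompt that classifies each question–answer pair, and iterate over `medhallu_labeled["train"]` instead of the toy list.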
---
## License
This dataset and associated resources are distributed under the [MIT License](https://opensource.org/license/mit/).
## Citations
If you find MedHallu useful in your research, please consider citing our work:
```bibtex
@misc{pandit2025medhallucomprehensivebenchmarkdetecting,
title={MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models},
author={Shrey Pandit and Jiawei Xu and Junyuan Hong and Zhangyang Wang and Tianlong Chen and Kaidi Xu and Ying Ding},
year={2025},
eprint={2502.14302},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.14302},
}
```
## Contact
For further information or inquiries about MedHallu, please reach out at [email protected]