OPERA is an OPEn Respiratory Acoustic foundation model pretraining and benchmarking system. We curate large-scale respiratory audio datasets (136K samples, 440 hours), pretrain three pioneering foundation models, and build a benchmark of 19 downstream respiratory health tasks for evaluation. Our pretrained models outperform existing acoustic models pretrained on general audio on 16 of the 19 tasks, and they generalize to unseen datasets and new respiratory audio modalities. These results highlight the promise of respiratory acoustic foundation models, and we encourage further studies that use OPERA as an open resource to accelerate research on respiratory audio for health.
Usage
The code is available at: https://github.com/evelyn0414/OPERA
Example of extracting features from your own data:
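import numpy as np
from src.benchmark.model_util import extract_opera_feature

feature_dir = "feature/"  # placeholder (assumption): folder containing sound_dir_loc.npy
# array of audio filenames
sound_dir_loc = np.load(feature_dir + "sound_dir_loc.npy")
# extract features with the operaCT pretrained model
opera_features = extract_opera_feature(sound_dir_loc, pretrain="operaCT", input_sec=8, dim=768)
np.save(feature_dir + "operaCT_feature.npy", np.array(opera_features))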
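The example above expects a sound_dir_loc.npy file that holds an array of filenames. If you do not have one yet, the following is a minimal sketch of one way to create it from a folder of recordings. The helper name build_sound_dir_loc, the "*.wav" pattern, and the directory arguments are illustrative assumptions and are not part of the OPERA codebase.

```python
import glob
import os

import numpy as np


def build_sound_dir_loc(audio_dir, feature_dir, pattern="*.wav"):
    """Hypothetical helper: save the audio file paths found in audio_dir
    as feature_dir/sound_dir_loc.npy and return the path of the saved file."""
    # Sort so the order of filenames (and later of features) is reproducible.
    paths = sorted(glob.glob(os.path.join(audio_dir, pattern)))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern} in {audio_dir}")
    os.makedirs(feature_dir, exist_ok=True)
    out_path = os.path.join(feature_dir, "sound_dir_loc.npy")
    # A plain array of strings can be reloaded with np.load without pickling.
    np.save(out_path, np.array(paths))
    return out_path
```

Calling build_sound_dir_loc("my_audio", "feature") would write feature/sound_dir_loc.npy, which can then be loaded with np.load as in the extraction example. Whether extract_opera_feature accepts every audio format matched by the pattern is not confirmed here, so check the repository before changing it.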
Citation
Kindly cite our work if you find it useful.
@misc{zhang2024openrespiratoryacousticfoundation,
title={Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking},
author={Yuwei Zhang and Tong Xia and Jing Han and Yu Wu and Georgios Rizos and Yang Liu and Mohammed Mosuily and Jagmohan Chauhan and Cecilia Mascolo},
year={2024},
eprint={2406.16148},
archivePrefix={arXiv},
primaryClass={cs.SD},
url={https://arxiv.org/abs/2406.16148},
}