
Demographic Bias of Expert-Level Vision-Language Foundation Models in Medical Imaging

[Paper]

Summary: Advances in artificial intelligence (AI) have achieved expert-level performance in medical imaging applications. Notably, self-supervised vision-language foundation models can detect a broad spectrum of pathologies without relying on explicit training annotations. However, it is crucial to ensure that these AI models do not mirror or amplify human biases, thereby disadvantaging historically marginalized groups such as females or Black patients. The manifestation of such biases could systematically delay essential medical care for certain patient subgroups. In this study, we investigate the algorithmic fairness of state-of-the-art vision-language foundation models in chest X-ray diagnosis across five globally sourced datasets. Our findings reveal that, compared to board-certified radiologists, these foundation models consistently underdiagnose marginalized groups, with even higher underdiagnosis rates in intersectional subgroups such as Black female patients. These demographic biases are present across a wide range of pathologies and demographic attributes. Further analysis of the model embeddings reveals that they significantly encode demographic information. Deploying AI systems with such biases in medical imaging can intensify pre-existing care disparities, posing potential challenges to equitable healthcare access and raising ethical questions about their clinical application.

Dataset

To download all the datasets used in this study, please follow the instructions in DataSources.md.

As the original image files are often high resolution, we cache downsampled copies of the images to speed up training on certain datasets. To do so, run

python -m scripts.cache_cxr --data_path <data_path> --dataset <dataset>

where <dataset> can be mimic or vindr. This step is required for vindr and optional for the remaining datasets.
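For example, to cache the VinDr-CXR images (the data path below is a placeholder for your local copy):

python -m scripts.cache_cxr --data_path /path/to/vindr --dataset vindr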

Model Checkpoints

This repo uses CheXzero as the driving example of a vision-language foundation model. Download the CheXzero model checkpoints and save them in the ./checkpoints directory.
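To sanity-check the downloaded files, here is a minimal sketch, assuming the checkpoints are standard PyTorch .pt files (adjust the glob pattern to match the files you actually downloaded):

import glob
import torch

# List every checkpoint in ./checkpoints and confirm it deserializes.
for ckpt_path in sorted(glob.glob("./checkpoints/*.pt")):
    state = torch.load(ckpt_path, map_location="cpu")
    print(f"{ckpt_path}: OK ({type(state).__name__})")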

Zero-Shot Evaluation

python -m zero_shot \
       --dataset <dataset> \
       --split <split> \
       --template <name_of_your_prompt_template> \
       --data_dir <data_path> \
       --model_dir <model_path> \
       --predictions_dir <output_path>
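A concrete invocation might look like the following; the template name and all paths are placeholders for your own setup, not values shipped with this repo:

python -m zero_shot \
       --dataset mimic \
       --split test \
       --template my_prompt_template \
       --data_dir /path/to/mimic \
       --model_dir ./checkpoints \
       --predictions_dir ./predictions

For reference, CLIP-style zero-shot classification (the family CheXzero belongs to) scores each pathology by contrasting the image embedding with embeddings of a positive and a negative text prompt. Below is a minimal sketch of that scoring step, assuming a CLIP-like model exposing encode_image and encode_text plus a tokenizer; this is illustrative, not this repo's exact API:

import torch

def zero_shot_probability(model, tokenizer, image, pathology):
    # Contrast a positive prompt against a negative one; the prompt
    # wording is illustrative and would normally come from the template.
    prompts = tokenizer([pathology, f"no {pathology}"])
    with torch.no_grad():
        img = model.encode_image(image)             # shape (1, d)
        txt = model.encode_text(prompts)            # shape (2, d)
        img = img / img.norm(dim=-1, keepdim=True)  # cosine-normalize
        txt = txt / txt.norm(dim=-1, keepdim=True)
        probs = (img @ txt.T).softmax(dim=-1)       # positive vs. negative
    return probs[0, 0].item()  # probability the pathology is present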

Acknowledgements

This code is partly based on the open-source implementations from CheXzero and SubpopBench.

Citation

If you find this code or idea useful, please cite our work:

@article{yang2024demographic,
  title={Demographic Bias of Expert-Level Vision-Language Foundation Models in Medical Imaging},
  author={Yuzhe Yang and Yujia Liu and Xin Liu and Avanti Gulhane and Domenico Mastrodicasa and Wei Wu and Edward J Wang and Dushyant W Sahani and Shwetak Patel},
  journal={arXiv preprint arXiv:2402.14815},
  year={2024}
}

Contact

If you have any questions, feel free to contact us through email (yuzhe@mit.edu) or GitHub issues. Enjoy!