Deep learning based method called hist2RNA to predict the expression of genes using digital images of stained tissue samples

hist2RNA: Predicting Gene Expression from Histopathology Images [Paper]

Introduction

hist2RNA is a deep learning project that predicts gene expression from breast cancer histopathology images. It employs an efficient architecture to uncover the underlying gene expression in breast cancer tissue.

Features

  • A state-of-the-art deep learning model tailored for breast cancer histopathology images
  • Efficient prediction of gene expression from histopathology images, which reduces training time
  • User-friendly command-line interface
  • Comprehensive documentation and tutorials

Data Sources

The following data sources have been used in this project:

Requirements

  • Python 3.9+
  • PyTorch 2.0
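
Before installing, it can help to confirm that the environment meets the two requirements above. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `check_environment` are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`.

```python
import sys
from importlib import metadata

# Minimum versions taken from the Requirements list above.
MIN_PYTHON = (3, 9)
MIN_TORCH = (2, 0)


def parse_version(text):
    """Turn a version string such as '2.0.1+cu118' into a tuple like (2, 0, 1)."""
    parts = []
    # Drop any local build suffix (for example '+cu118') before splitting.
    for piece in text.split("+")[0].split("."):
        leading = ""
        for ch in piece:
            if not ch.isdigit():
                break
            leading += ch
        if not leading:
            break
        parts.append(int(leading))
    return tuple(parts)


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.9 or newer is required")
    try:
        # Assumption: PyTorch is installed as the 'torch' distribution.
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if parse_version(torch_version)[:2] < MIN_TORCH:
            problems.append("PyTorch 2.0 or newer is required, found " + torch_version)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Reading the installed version through `importlib.metadata` avoids importing PyTorch itself, so the check stays fast even on machines without a GPU.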

Image preprocessing

Annotation and Patch Creation

Image Color Normalization

Installation

  1. Clone the repository: git clone https://github.com/raktim-mondol/hist2RNA.git

  2. Change directory to the cloned repository: cd hist2RNA

  3. Install the required packages: pip install -r requirements.txt

  4. Train the model:

python training_main.py --slides_dir ./data/slides/ --epochs 50 --batch_size 12 --lr 0.001

  5. Test the model:

python test_main.py --test_patient_id ./patient_details/test_patient_id.txt --checkpoint_file ./models/hist2RNA_model.pth

For the most efficient workflow, run the following two scripts in order:

python step_1_feature_extraction.py

Then,

python step_2_model_training_.py

For detailed usage instructions, please refer to the documentation.

Peak results utilizing the hist2RNA methodology:

The following results show predictions for the PAM50 genes from histopathology images in the test dataset:

Spearman Correlation Coefficient [Updated]

Spearman Correlation Coefficient

AUC-RCH (A performance metric we've developed)

Reverse_cumulative_histogram
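
This README does not define AUC-RCH, so the following is only one plausible reading of the name, not the authors' implementation: build a reverse cumulative histogram (for each threshold, the fraction of genes whose correlation coefficient is at or above it) and take the area under that curve. The function names, the threshold grid from 0 to 1, and the sample coefficients are all assumptions made for illustration; refer to the paper for the actual definition.

```python
def reverse_cumulative_histogram(coefficients, thresholds):
    """For each threshold, return the fraction of coefficients at or above it."""
    n = len(coefficients)
    return [sum(1 for c in coefficients if c >= t) / n for t in thresholds]


def area_under_curve(xs, ys):
    """Trapezoidal area under the points (xs[i], ys[i])."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
        for i in range(len(xs) - 1)
    )


# Assumed threshold grid: 0.0, 0.1, ..., 1.0.
thresholds = [i / 10 for i in range(11)]

# Made-up per-gene correlation coefficients, for illustration only.
example_coefficients = [0.2, 0.5, 0.8, 0.9]

curve = reverse_cumulative_histogram(example_coefficients, thresholds)
auc_rch = area_under_curve(thresholds, curve)
print("curve:", curve)
print("area under curve:", auc_rch)
```

Under this reading, a model whose genes mostly reach high correlations keeps the curve close to 1 for longer and so scores a larger area.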

Gene prediction across patients:

This analysis leverages the overall pattern of gene expression within each patient, which gives a more holistic view of gene behavior across the population.

Gene prediction across genes:

This analysis focuses on the expression patterns of each gene individually. This reveals the significant variability in gene expression among different patients, which can lead to lower correlation coefficients.
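
The difference between the two analyses can be made concrete with a small sketch. It assumes expression values are stored as a matrix with one row per patient and one column per gene, and it reads "across patients" as one coefficient per patient (computed over that patient's genes) and "across genes" as one coefficient per gene (computed over all patients). The function names and the toy numbers are hypothetical and are not taken from the repository or the paper.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one side is constant
    return cov / math.sqrt(vx * vy)


def per_patient_correlation(true, pred):
    """One coefficient per patient, computed over that patient's gene profile."""
    return [spearman(t, p) for t, p in zip(true, pred)]


def per_gene_correlation(true, pred):
    """One coefficient per gene, computed over all patients."""
    return [spearman(list(t), list(p)) for t, p in zip(zip(*true), zip(*pred))]


# Toy data: 3 patients (rows) x 4 genes (columns), values made up for illustration.
true_expr = [
    [1.0, 5.0, 9.0, 13.0],
    [1.2, 5.1, 9.3, 13.1],
    [0.9, 5.3, 9.1, 13.4],
]
pred_expr = [
    [1.1, 5.2, 9.2, 13.2],
    [1.0, 5.3, 9.0, 13.3],
    [1.2, 5.0, 9.4, 13.0],
]

print("per patient:", per_patient_correlation(true_expr, pred_expr))
print("per gene:", per_gene_correlation(true_expr, pred_expr))
```

In this toy example the large differences between genes dominate each patient's profile, so the per-patient coefficients are high even though the predictions order patients poorly within every gene. This is why per-gene coefficients tend to be the lower and stricter of the two.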

Contributing

We welcome contributions to improve and expand the capabilities of hist2RNA! Please follow the contributing guidelines to get started.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Cite Us:

If you find this code useful in your research, please consider citing:

@Article{cancers15092569,
AUTHOR = {Mondol, Raktim Kumar and Millar, Ewan K. A. and Graham, Peter H. and Browne, Lois and Sowmya, Arcot and Meijering, Erik},
TITLE = {hist2RNA: An Efficient Deep Learning Architecture to Predict Gene Expression from Breast Cancer Histopathology Images},
JOURNAL = {Cancers},
VOLUME = {15},
YEAR = {2023},
NUMBER = {9},
ARTICLE-NUMBER = {2569},
URL = {https://www.mdpi.com/2072-6694/15/9/2569},
ISSN = {2072-6694},
DOI = {10.3390/cancers15092569}
}
