Classification of ISUP grades from Whole Slide Images - DLMI Kaggle Challenge

Authors: Apavou Clément & Belkada Younes

The Kaggle challenge is available at: https://www.kaggle.com/c/mvadlmi/leaderboard

🔎 Introduction

With more than 1 million new diagnoses reported every year, prostate cancer (PCa) is the second most common cancer among males worldwide and results in more than 350,000 deaths annually. The key to decreasing mortality is developing more precise diagnostics. Diagnosis of PCa is based on the grading of prostate tissue biopsies. These tissue samples are examined by a pathologist and scored according to the Gleason grading system. In this challenge, you will develop models for detecting PCa on images of prostate tissue samples, and estimate the severity of the disease using the most extensive multi-center dataset on Gleason grading yet available.

The grading process consists of finding and classifying cancer tissue into so-called Gleason patterns (3, 4, or 5) based on the architectural growth patterns of the tumor (Fig. below). After the biopsy is assigned a Gleason score, it is converted into an ISUP grade on a 1-5 scale. The Gleason grading system is the most important prognostic marker for PCa, and the ISUP grade has a crucial role when deciding how a patient should be treated. There is both a risk of missing cancers and a large risk of overgrading resulting in unnecessary treatment. However, the system suffers from significant inter-observer variability between pathologists, limiting its usefulness for individual patients. This variability in ratings could lead to unnecessary treatment, or worse, missing a severe diagnosis.

The goal of this challenge is to predict the ISUP grade using only histopathology images. This requires processing Whole Slide Images (WSI), which are huge gigapixel images, and coping with the limited number of patients provided in the training set.

Classes: [0, 1, 2, 3, 4, 5]
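The conversion from a Gleason score to an ISUP grade described above can be sketched as follows. This is a minimal illustration of the standard ISUP grouping, not code from this repository; the function name `gleason_to_isup` is hypothetical, and class 0 is assumed to denote a biopsy without cancer.

```python
def gleason_to_isup(primary: int, secondary: int) -> int:
    """Map a Gleason score (primary + secondary pattern) to an ISUP grade.

    Patterns are 3, 4 or 5. The pair (0, 0) is assumed here to encode
    a biopsy in which no cancer was found (class 0).
    """
    if (primary, secondary) == (0, 0):
        return 0  # assumption: class 0 means "no cancer"
    for pattern in (primary, secondary):
        if pattern not in (3, 4, 5):
            raise ValueError(f"invalid Gleason pattern: {pattern}")
    total = primary + secondary
    if total == 6:   # 3+3
        return 1
    if total == 7:   # 3+4 is grade 2, 4+3 is grade 3
        return 2 if primary == 3 else 3
    if total == 8:   # 4+4, 3+5, 5+3
        return 4
    return 5         # 4+5, 5+4, 5+5


print(gleason_to_isup(4, 3))
```

Note that 3+4 and 4+3 share the same total score but map to different ISUP grades, because the primary pattern is the dominant one in the tissue.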

🔨 Getting started

Download the dataset and extract it in the assets folder.

Choose the mode that you want:

Classification: classify the ISUP grade of images
Segmentation: semantic segmentation of images
Classif_WITH_Seg: classification using a semantic segmentation model trained in Segmentation mode

Choose a dataset and a model adapted to the mode.
Models for:

The available datasets are defined in datasets.py.

Feature extractors come from the timm library.

⭐ Best model with segmentation (Final Submission)

Method name: Concatenate top patches

MODE: Classif_WITH_Seg
dataset_name: ConcatTopPatchDataset
feature_extractor_name: tresnet_xl_448
network_name: SimpleModel

Command line to train the model:

python main.py --train True --MODE Classif_WITH_Seg --dataset_name ConcatTopPatchDataset --patch_size 256 --nb_samples 16 --max_epochs 150 --batch_size 2 --accumulate_grad_batches 8 --discounted_draw False --percentage_blank 0.5 --resized_img 512 --seed_everything 6836
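The training command combines a small `--batch_size` with `--accumulate_grad_batches`. Assuming this flag is forwarded to the PyTorch Lightning `Trainer` argument of the same name (not confirmed by this README), gradients are accumulated over that many batches before each optimizer step. The hypothetical helper below only illustrates the resulting arithmetic.

```python
def effective_batch_size(batch_size: int,
                         accumulate_grad_batches: int,
                         num_devices: int = 1) -> int:
    """Number of samples contributing to one optimizer step."""
    return batch_size * accumulate_grad_batches * num_devices


# Values taken from the command above: --batch_size 2 --accumulate_grad_batches 8
print(effective_batch_size(2, 8))  # 16
```

This is why a batch size of 2, which fits large concatenated images in GPU memory, can still give reasonably stable gradient estimates.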

drawn-dream-632 is the name of the wandb run of the segmentation model trained with our framework (Segmentation mode).

Command line to create submission csv file:

python main.py --train False --MODE Classif_WITH_Seg --dataset_name ConcatTopPatchDataset --patch_size 256 --nb_samples 16 --discounted_draw False --percentage_blank 0.5 --resized_img 512 --best_model rich-jazz-915

rich-jazz-915 is the name of the wandb run that holds the model weights. (The name will differ if you train the model yourself.)

Model	Backbone	Area Under ROC (weighted) validation	Area Under ROC (macro) test (private leaderboard)	Run
SimpleModel	tresnet_xl_448	0.8126	0.8833

Top patches concatenated from a WSI. Prediction: 4, Label: 4.
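How the "top" patches are ranked is not spelled out here. One plausible reading, sketched below as an assumption rather than the repository's actual code, is to score each candidate patch by the fraction of pixels that the segmentation model labels as cancerous (class 2 in the merged 3-class scheme) and keep the `nb_samples` highest-scoring ones. The function name `select_top_patches` is hypothetical.

```python
import numpy as np

CANCER_CLASS = 2  # merged label for cancerous tissue


def select_top_patches(masks: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k patches with the most predicted cancer.

    masks: integer array of shape (N, H, W) holding one predicted
           segmentation mask per candidate patch.
    """
    cancer_fraction = (masks == CANCER_CLASS).mean(axis=(1, 2))
    # Stable sort on the negated score: highest fraction first,
    # ties keep their original order.
    order = np.argsort(-cancer_fraction, kind="stable")
    return order[:k]


# Four toy 2x2 masks with cancer fractions 0.0, 1.0, 0.5 and 0.25.
toy_masks = np.array([
    [[0, 0], [1, 1]],
    [[2, 2], [2, 2]],
    [[2, 2], [0, 1]],
    [[2, 0], [1, 1]],
])
top = select_top_patches(toy_masks, k=2)
print(top)  # [1 2]
```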

⭐ Best model without segmentation (Submission)

Method name: Concatenate random patches

MODE: Classification
dataset_name: ConcatPatchDataset
feature_extractor_name: tresnet_xl_448
network_name: SimpleModel

Command line to train the model:

python main.py --train True --MODE Classification --dataset_name ConcatPatchDataset --patch_size 256 --nb_samples 36 --max_epochs 150 --batch_size 2 --accumulate_grad_batches 16 --discounted_draw True --seed_everything 6130

Command line to create submission csv file:

python main.py --train False --MODE Classification --dataset_name ConcatPatchDataset --patch_size 256 --nb_samples 36 --discounted_draw True --best_model denim-terrain-844

denim-terrain-844 is the name of the wandb run that holds the model weights. (The name will differ if you train the model yourself.)

Model	Backbone	Area Under ROC (weighted) validation	Area Under ROC (macro) test (private leaderboard) without voting	with voting	Run
SimpleModel	tresnet_xl_448	0.8034	[0.8774, 0.92647]	0.8641
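The results tables report both a weighted and a macro Area Under ROC. The sketch below shows how the two averages differ, assuming a one-vs-rest formulation (the README does not state which multiclass scheme was used). It is written with NumPy only, and the function names are hypothetical.

```python
import numpy as np


def binary_auc(y_true: np.ndarray, scores: np.ndarray) -> float:
    """AUC as the probability that a positive outranks a negative."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()  # ties count half
    return float(wins / (len(pos) * len(neg)))


def multiclass_auc(labels: np.ndarray, probas: np.ndarray,
                   average: str = "macro") -> float:
    """One-vs-rest AUC; every class must appear at least once in labels."""
    classes = np.arange(probas.shape[1])
    aucs = np.array([binary_auc((labels == c).astype(int), probas[:, c])
                     for c in classes])
    if average == "macro":      # every class counts equally
        return float(aucs.mean())
    if average == "weighted":   # classes weighted by their support
        support = np.array([(labels == c).sum() for c in classes])
        return float((aucs * support).sum() / support.sum())
    raise ValueError(f"unknown average: {average}")


labels = np.array([0, 0, 1, 2])
probas = np.array([[0.80, 0.10, 0.10],
                   [0.15, 0.65, 0.20],
                   [0.20, 0.60, 0.20],
                   [0.10, 0.20, 0.70]])
macro = multiclass_auc(labels, probas, "macro")
weighted = multiclass_auc(labels, probas, "weighted")
```

With imbalanced classes, the weighted average is dominated by the frequent grades, whereas the macro average treats every grade equally, so the two numbers are not directly comparable.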

Random patches concatenated from WSIs. Left: label 1, Radboud provider. Right: label 1, Karolinska provider.
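Concatenating patches into a single image can be sketched as below. This is an illustration under stated assumptions, not the code of `ConcatPatchDataset`: the patches are assumed to be square, their number a perfect square, and the layout row-major. The function name `concat_patches` is hypothetical.

```python
import math

import numpy as np


def concat_patches(patches: np.ndarray) -> np.ndarray:
    """Tile N square patches into one square image.

    patches: array of shape (N, P, P, C), with N a perfect square.
    Returns an array of shape (g * P, g * P, C) with g = sqrt(N),
    filled row by row.
    """
    n, p, _, c = patches.shape
    g = math.isqrt(n)
    if g * g != n:
        raise ValueError("the number of patches must be a perfect square")
    grid = patches.reshape(g, g, p, p, c)   # (row, col, y, x, channel)
    grid = grid.transpose(0, 2, 1, 3, 4)    # (row, y, col, x, channel)
    return grid.reshape(g * p, g * p, c)


# --nb_samples 36 --patch_size 256 gives a 6x6 grid of 256-pixel patches.
rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(36, 256, 256, 3), dtype=np.uint8)
image = concat_patches(patches)
print(image.shape)  # (1536, 1536, 3)
```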

🎨 Semantic Segmentation

MODE: Segmentation
dataset_name: PatchSegDataset

Model	Backbone	Data provider	Patch Size	Level	IoU (average over classes) validation
DeepLabV3Plus	resnet152	All	384	1	0.7858
DeepLabV3Plus	resnet34	Radboud	512	0	0.7029
DeepLabV3Plus	resnet34	Karolinska	512	0	0.5958

Karolinska is composed of 3 classes:

0: background (non tissue) or unknown
1: benign tissue (stroma and epithelium combined)
2: cancerous tissue (stroma and epithelium combined)

Radboud is composed of 6 classes:

0: background (non tissue) or unknown
1: stroma (connective tissue, non-epithelium tissue)
2: healthy (benign) epithelium
3: cancerous epithelium (Gleason 3)
4: cancerous epithelium (Gleason 4)
5: cancerous epithelium (Gleason 5)

We merged the Radboud classes into 3 classes to match Karolinska:

0: background (non tissue) or unknown {0}
1: benign tissue (stroma and epithelium combined) {1,2}
2: cancerous tissue (stroma and epithelium combined) {3,4,5}
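The merge above amounts to a lookup table indexed by the original Radboud label. A minimal NumPy sketch (the names `RADBOUD_TO_MERGED` and `merge_radboud_mask` are illustrative, not taken from the repository):

```python
import numpy as np

# Index = Radboud label, value = merged label: {0}->0, {1,2}->1, {3,4,5}->2
RADBOUD_TO_MERGED = np.array([0, 1, 1, 2, 2, 2], dtype=np.uint8)


def merge_radboud_mask(mask: np.ndarray) -> np.ndarray:
    """Convert a 6-class Radboud mask into the 3-class Karolinska scheme."""
    return RADBOUD_TO_MERGED[mask]


radboud_mask = np.array([[0, 1, 2],
                         [3, 4, 5]])
merged = merge_radboud_mask(radboud_mask)
print(merged)
```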

Segmentation of a Patch 384x384 from WSI: Patch, Prediction, Ground Truth

Segmentation of a Patch 384x384 from WSI of the Radboud data provider: Patch, Prediction, Ground Truth

Blue: background or unknown
Red: benign tissue
Green: Cancerous tissue
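A mask can be rendered with this color code by indexing a small palette. The exact RGB values below (pure blue, red and green) are an assumption made for illustration, as are the names `PALETTE` and `colorize`.

```python
import numpy as np

# Row index = class label; values are RGB triplets.
PALETTE = np.array([
    [0, 0, 255],   # 0: background or unknown -> blue
    [255, 0, 0],   # 1: benign tissue         -> red
    [0, 255, 0],   # 2: cancerous tissue      -> green
], dtype=np.uint8)


def colorize(mask: np.ndarray) -> np.ndarray:
    """Turn an (H, W) label mask into an (H, W, 3) RGB image."""
    return PALETTE[mask]


rgb = colorize(np.array([[0, 1, 2]]))
print(rgb.shape)  # (1, 3, 3)
```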

⭐ Best Segmentation Model

MODE: Segmentation
dataset_name: PatchSegDataset
network_name: DeepLabV3Plus
feature_extractor_name: resnet152

python main.py --train True --MODE Segmentation --dataset_name PatchSegDataset --dataset_static False --max_epochs 150 --batch_size 4 --accumulate_grad_batches 16 --nb_samples 4 --patch_size 384 --percentage_blank 0.5 --level 1 --seed_everything 4882
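The meaning of `--percentage_blank 0.5` is not documented here. One plausible interpretation, shown purely as an assumption, is that a patch is discarded when more than that fraction of its pixels is blank slide background. In the sketch below, "blank" is approximated by near-white pixels; the threshold and both function names are hypothetical.

```python
import numpy as np


def blank_fraction(patch: np.ndarray, white_threshold: int = 230) -> float:
    """Fraction of pixels whose R, G and B values are all near white.

    patch: uint8 array of shape (H, W, 3).
    """
    near_white = patch.min(axis=-1) >= white_threshold
    return float(near_white.mean())


def keep_patch(patch: np.ndarray, percentage_blank: float = 0.5) -> bool:
    """Keep a patch only if at most `percentage_blank` of it is blank."""
    return blank_fraction(patch) <= percentage_blank


# Toy 4x4 patch: top half white background, bottom half dark tissue.
half_blank = np.full((4, 4, 3), 255, dtype=np.uint8)
half_blank[2:] = 90
print(blank_fraction(half_blank), keep_patch(half_blank))
```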

Model	Backbone	Data Provider	mIoU validation	Run
DeepLabV3Plus	resnet152	Both	0.7858
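For reference, an IoU averaged over classes can be computed as in the sketch below. This is a generic NumPy illustration, not the metric implementation used for the reported numbers; in particular, skipping classes that are absent from both prediction and ground truth is an assumption.

```python
import numpy as np


def mean_iou(pred: np.ndarray, target: np.ndarray,
             num_classes: int = 3) -> float:
    """Intersection over Union averaged over the classes that occur."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both masks: skipped (assumption)
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


pred = np.array([[0, 1],
                 [2, 2]])
target = np.array([[0, 1],
                   [1, 2]])
# Per-class IoU: 1/1 for class 0, 1/2 for class 1, 1/2 for class 2.
score = mean_iou(pred, target)
```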
