
mmdet-nucleus-instance-segmentation

License: MIT

by Zhi-Yi Chin

This repository is the implementation of homework 3 for the IOC5008 Selected Topics in Visual Recognition using Deep Learning course, offered in the 2021 fall semester at National Yang Ming Chiao Tung University.

In this homework, we participate in the nuclei segmentation challenge on CodaLab, performing instance segmentation on the TCGA nuclei dataset from the 2018 Kaggle Data Science Bowl. The dataset contains 24 training images with 14,598 nuclei and 6 test images with 2,360 nuclei. Pre-trained models are allowed for training, but no external data may be used. We apply four existing methods to this challenge.

Getting the code

You can download a copy of all the files in this repository by cloning it:

git clone https://github.com/joycenerd/mmdet-nucleus-instance-segmentation.git

Requirements

You need to have Anaconda or Miniconda already installed in your environment. To install requirements:

1. Create a conda environment

conda create -n openmmlab python=3.7 -y
conda activate openmmlab

2. Install mmdetection

conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch
pip install openmim
mim install mmdet

3. Install the customized mmdetection included in this repository

cd mmdetection
python setup.py install

For more information, please see get_started.md.

4. Install imantics (used to convert binary masks into COCO segmentation annotations)

pip install imantics
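
To verify the environment, you can run a quick check in Python (a minimal sanity check, assuming the steps above completed without errors):

# Minimal environment sanity check (assumes the install steps above succeeded).
import torch
import mmcv
import mmdet
print(torch.__version__, torch.cuda.is_available())  # expect 1.7.0 and True on a GPU machine
print(mmcv.__version__, mmdet.__version__)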

Dataset

You can either download the data we have already pre-processed (option 1) or download the raw data (option 2).

Option 1: Download the pre-processed data

  1. Download the data from the Google Drive link: nucleus_data.zip
  2. After decompressing the zip file, the data folder structure should look like this:
nucleus_data
├── all_train
│   ├── TCGA-18-5592-01Z-00-DX1.png
│   ├── TCGA-21-5784-01Z-00-DX1.png
│   ├── TCGA-21-5786-01Z-00-DX1.png
│   ├── ......
├── annotations
│   ├── instance_all_train.json
│   ├── instance_test.json
│   ├── instance_train.json
│   ├── instance_val.json
│   └── test_img_ids.json
├── classes.txt
├── test
│   ├── TCGA-50-5931-01Z-00-DX1.png
│   ├── TCGA-A7-A13E-01Z-00-DX1.png
│   ├── ......
├── train
│   ├── TCGA-18-5592-01Z-00-DX1.png
│   ├── TCGA-21-5786-01Z-00-DX1.png
│   ├── ......
└── val
    ├── TCGA-21-5784-01Z-00-DX1.png
    ├── TCGA-B0-5711-01Z-00-DX1.png
    ├── ......

Option 2: Download the raw data

  1. Download the data from the Google Drive link: dataset.zip
  2. After decompressing the zip file, the data folder structure should look like this:
dataset
├── test
│   ├── .ipynb_checkpoints
│   │   ├── TCGA-50-5931-01Z-00-DX1-checkpoint.png
│   │   ├── TCGA-AY-A8YK-01A-01-TS1-checkpoint.png
│   │   ├── TCGA-G9-6336-01Z-00-DX1-checkpoint.png
│   │   └── TCGA-G9-6348-01Z-00-DX1-checkpoint.png
│   ├── TCGA-50-5931-01Z-00-DX1.png
│   ├── TCGA-A7-A13E-01Z-00-DX1.png
│   ├── TCGA-AY-A8YK-01A-01-TS1.png
│   ├── TCGA-G2-A2EK-01A-02-TSB.png
│   ├── TCGA-G9-6336-01Z-00-DX1.png
│   └── TCGA-G9-6348-01Z-00-DX1.png
├── test_img_ids.json
└── train
    ├── TCGA-18-5592-01Z-00-DX1
    │   ├── images
    │   │   └── TCGA-18-5592-01Z-00-DX1.png
    │   └── masks
    │       ├── .ipynb_checkpoints
    │       │   └── mask_0002-checkpoint.png
    │       ├── mask_0001.png
    │       ├── mask_0002.png
    │       ├── ......
    ├── TCGA-RD-A8N9-01A-01-TS1
    │   ├── images
    │   │   └── TCGA-RD-A8N9-01A-01-TS1.png
    │   └── masks
    │       ├── mask_0001.png
    │       ├── mask_0002.png
    │       ├── ......
    └── ......

Data pre-processing

Note: If you downloaded the data following option 1, you can skip this step.

If your raw data folder structure is different, you will need to modify train_valid_split.py and mask2coco.py before executing the code.

1. Train/validation split

By default, we split the whole training set into 80% for training and 20% for validation.

python train_valid_split.py --data-root <save_dir>/dataset/train --ratio 0.2 --out-dir <save_dir>/nucleus_data
  • input: the original whole-training-set image directory
  • output: a new data directory named nucleus_data containing two folders, train/ and val/, with the split images inside (see the sketch below)
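
For reference, the split logic can be sketched as follows. This is a simplified, hypothetical version; the actual train_valid_split.py may differ in its details and argument handling.

# Hypothetical sketch of an 80/20 train/validation split over the raw dataset layout.
import random
import shutil
from pathlib import Path
data_root = Path('dataset/train')    # each subfolder holds images/ and masks/
out_dir = Path('nucleus_data')
ratio = 0.2                          # fraction of samples used for validation
samples = sorted(p for p in data_root.iterdir() if p.is_dir())
random.seed(0)
random.shuffle(samples)
n_val = int(len(samples) * ratio)
splits = {'val': samples[:n_val], 'train': samples[n_val:]}
for split, folders in splits.items():
    split_dir = out_dir / split
    split_dir.mkdir(parents=True, exist_ok=True)
    for folder in folders:
        # copy each sample's single image into the flat split folder
        shutil.copy(folder / 'images' / f'{folder.name}.png', split_dir / f'{folder.name}.png')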

2. Convert binary mask images into COCO segmentation annotations

python mask2coco.py --mode <train_or_val> --data_root <save_dir>/nucleus_data/<train_or_val> --mask_root <save_dir>/dataset/train --out_dir <save_dir>/nucleus_data/annotations
  • input:
    1. the train or val folder path from the previous step
    2. the root directory where the binary masks are saved
  • output: instance_train.json or instance_val.json in nucleus_data/annotations/ (see the sketch below)
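
Conceptually, each binary mask becomes one COCO annotation with a polygon segmentation. Below is a minimal, hypothetical sketch using imantics; the real mask2coco.py handles more details and may differ.

# Hypothetical sketch: turn per-nucleus binary masks into a COCO-style annotation file.
import json
import numpy as np
from pathlib import Path
from PIL import Image
from imantics import Mask
mask_root = Path('dataset/train')    # <image_name>/masks/mask_*.png
images, annotations, ann_id = [], [], 1
for img_id, sample in enumerate(sorted(p for p in mask_root.iterdir() if p.is_dir()), 1):
    img = Image.open(sample / 'images' / f'{sample.name}.png')
    images.append({'id': img_id, 'file_name': f'{sample.name}.png',
                   'width': img.width, 'height': img.height})
    for mask_path in sorted((sample / 'masks').glob('mask_*.png')):
        binary = np.array(Image.open(mask_path).convert('L')) > 0
        ys, xs = np.where(binary)
        bbox = [int(xs.min()), int(ys.min()), int(xs.max() - xs.min() + 1), int(ys.max() - ys.min() + 1)]
        annotations.append({'id': ann_id, 'image_id': img_id, 'category_id': 1,
                            'segmentation': Mask(binary).polygons().segmentation,
                            'bbox': bbox, 'area': int(binary.sum()), 'iscrowd': 0})
        ann_id += 1
coco = {'images': images, 'annotations': annotations,
        'categories': [{'id': 1, 'name': 'nucleus'}]}
with open('instance_train.json', 'w') as f:
    json.dump(coco, f)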

Training

You need a graphics card (GPU) to train the model. For reference, we trained on a single NVIDIA Tesla V100.

1. Download the pre-trained weights (pre-trained on COCO)

| Model | Backbone | Lr schd | Download |
| --- | --- | --- | --- |
| Mask RCNN | R50 | 3x | model |
| Mask RCNN | X101 | 3x | model |
| Cascade Mask RCNN | R50 | 3x | model |
| Cascade Mask RCNN | X101 | 3x | model |
| PointRend | R50 | 3x | model |
| Mask Scoring RCNN | X101 | 1x | model |

2. Modify config file

Go to Results and Models and find the model configuration you want to train. You will need to modify the configuration file before training. The things you need to modify are:

  • ann_file and img_prefix in the data section
  • Put the downloaded pre-trained weights path in load_from

Tip: We got better results when using all 24 images for training. You can try img_prefix: all_train and ann_file: instance_all_train.json.

You can also find all of our custom configuration files in mmdetection/configs/nucleus and modify them to your own needs; a sketch of the typical overrides is shown below.
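
As a concrete illustration, the overrides usually look like the snippet below. This is only a sketch with placeholder paths; the actual configuration files under mmdetection/configs/nucleus are the authoritative versions.

# Sketch of the fields to adjust in an mmdetection config (paths are placeholders).
data_root = '/path/to/nucleus_data/'
classes = ('nucleus',)
data = dict(
    train=dict(
        classes=classes,
        img_prefix=data_root + 'all_train/',
        ann_file=data_root + 'annotations/instance_all_train.json'),
    val=dict(
        classes=classes,
        img_prefix=data_root + 'val/',
        ann_file=data_root + 'annotations/instance_val.json'),
    # for validation, point this test section to the val split instead (see Validation below)
    test=dict(
        classes=classes,
        img_prefix=data_root + 'test/',
        ann_file=data_root + 'annotations/instance_test.json'))
# point load_from to the COCO pre-trained checkpoint downloaded in step 1
load_from = '/path/to/pretrained_checkpoint.pth'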

3. Train the model

python tools/train.py <config_file_path> --work-dir <save_dir>/train
  • input: model configuration file
  • output: a checkpoint for every epoch and the training logs, saved in <save_dir>/train

Validation

In the configuration file, the test ann_file and img_prefix should point to the validation data, not the test data, because the test data has no ground truth.

python tools/test.py <config_file_path> <save_dir>/train/epoch_<X>.pth --eval bbox segm --work-dir <save_dir>/val
  • input:
    • model configuration file
    • the checkpoint saved during training in the previous step
  • output: validation logs

Testing

1. Convert the test images to COCO format

Note: If you downloaded the data following option 1 in the Dataset section, you can skip this step.

python tools/dataset_converters/images2coco.py <data_dir>/nucleus_data/test <data_dir>/nucleus_data/classes.txt instance_test.json --imgid_json <data_dir>/nucleus_data/annotations/test_img_ids.json
  • input:
    • test image directory
    • classes.txt: class names
    • test_img_ids.json: test image IDs
  • output: instance_test.json (see the sketch below)
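
The resulting instance_test.json is a standard COCO-style file listing the test images and the class, with no ground-truth labels. A quick way to check it (a sketch; the exact keys may vary slightly):

# Quick look at the generated COCO-style test file (exact keys may vary).
import json
with open('instance_test.json') as f:
    coco = json.load(f)
print(list(coco.keys()))     # expect at least 'images' and 'categories'
print(coco['images'][0])     # e.g. {'id': 1, 'file_name': 'TCGA-....png', 'width': ..., 'height': ...}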

2. Generate testing results

Before testing, please ensure that the test image folder path and the path to instance_test.json are correct in the model configuration file.

python tools/test.py <config_file_path> <save_dir>/train/epoch_<X>.pth --format-only --options "jsonfile_prefix=test" --show
  • input:
    • model configuration file
    • trained model checkpoint
  • output:
    • test.segm.json: instance segmentation results (see the sketch below)
    • test.bbox.json: detection results
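
Each entry in test.segm.json is one predicted instance with an RLE-encoded mask. It can be inspected with pycocotools, which mmdetection already depends on (a small sketch, assuming mmdetection's default COCO-style result format):

# Inspect the predicted instances in test.segm.json (sketch).
import json
from pycocotools import mask as mask_utils
with open('test.segm.json') as f:
    results = json.load(f)                    # a list of per-instance predictions
first = results[0]
print(first['image_id'], first['category_id'], first['score'])
binary = mask_utils.decode(first['segmentation'])   # RLE -> HxW uint8 array
print(binary.shape, int(binary.sum()))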

Submit the results

  1. rename the result file: mv test.segm.json answer.json
  2. compress the file: zip answer.zip answer.json
  3. upload the result to CodaLab to get the testing score

Results and Models

| Model | Backbone | Lr schd | Mask AP | Config | Download |
| --- | --- | --- | --- | --- | --- |
| Mask RCNN | R50 | 3x | 0.2323 | config | model |
| Mask RCNN | X101 | 3x | 0.2316 | config | - |
| Cascade Mask RCNN | R50 | 3x | 0.2428 | config | model |
| Cascade Mask RCNN | X101 | 3x | 0.2444 | config | model |
| PointRend | R50 | 3x | 0.2439 | config | model |
| Mask Scoring RCNN | X101 | 1x | 0.2420 | config | model |

Inference

Note: we use Cascade Mask RCNN with an X101 backbone as our best model. To reproduce our best results, follow these steps:

  1. Getting the code
  2. Install the dependencies
  3. Download the data: please download the data by following option#1
  4. Download pre-trained weights
  5. Modify the config file
  6. Download checkpoints
  7. Testing
  8. Submit the results

FAQ

If any problem occurs while you are using this project, first check faq.md to see if there is already a solution to your problem.

GitHub Acknowledgement

We thank the authors of these repositories:

Citation

If you find our work useful in your project, please cite:

@misc{chin2021mmdetnucleus,
    title = {mmdet-nucleus-instance-segmentation},
    author = {Zhi-Yi Chin},
    url = {https://github.com/joycenerd/mmdet-nucleus-instance-segmentation},
    year = {2021}
}

Contributing

If you'd like to contribute, or have any suggestions, you can contact us at joycenerd.cs09@nycu.edu.tw or open an issue on this GitHub repository.

All contributions are welcome! All content in this repository is licensed under the MIT license.