Location Based Efficient Panoptic Segmentation

Panoptic segmentation is a scene understanding problem that combines instance and semantic segmentation predictions into a single unified output. This project implements a location-based panoptic segmentation model, modifying the state-of-the-art EfficientPS architecture by using SOLOv2 as the instance segmentation head instead of Mask R-CNN.
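Concretely, a panoptic prediction assigns every pixel both a semantic class and, for "thing" classes, an instance id. A minimal sketch of the idea (the `semantic_class * 1000 + instance_id` encoding below is just one common convention, used here for illustration):

```python
import numpy as np

# Illustrative per-pixel encoding: panoptic_id = semantic_class * 1000 + instance_id
semantic = np.array([[0, 0], [11, 11]])   # e.g. 0 = a "stuff" class, 11 = a "thing" class
instance = np.array([[0, 0], [1, 2]])     # instance ids; 0 for "stuff" pixels
panoptic = semantic * 1000 + instance     # every pixel gets one class and one instance
```

Two adjacent pixels of the same "thing" class but different instances thus receive distinct panoptic ids, which is exactly what neither instance nor semantic segmentation provides on its own.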

System Requirements

  • Linux
  • Python 3.7
  • PyTorch 1.7
  • CUDA 10.2
  • GCC 7 or 8

Dependencies

Install the following packages.

For EfficientPS

pip install -U albumentations
pip install pytorch-lightning
pip install inplace-abn
pip install efficientnet_pytorch
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
pip install git+https://github.com/cocodataset/panopticapi.git

For SOLOv2

Install the dependencies by running

pip install pycocotools
pip install numpy
pip install scipy
pip install torch==1.5.1 torchvision==0.6.1
pip install mmcv

Dataset Preparation

  1. Download the gtFine and leftImg8bit packages of the Cityscapes dataset from https://www.cityscapes-dataset.com/ and unzip leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip into data/cityscapes
  2. Convert the dataset into COCO format using the conversion tool in mmdetection:
  • Clone the repository using git clone https://github.com/open-mmlab/mmdetection.git
  • Enter the repository using cd mmdetection
  • Install cityscapescripts using pip install cityscapesscripts
  • Run the script as
python tools/dataset_converters/cityscapes.py \
    data/cityscapes/ \
    --nproc 8 \
    --out-dir data/cityscapes/annotations
  3. Create the panoptic images JSON file:
  • Clone the repository using git clone https://github.com/mcordts/cityscapesScripts.git
  • Install it using pip install git+https://github.com/mcordts/cityscapesScripts.git
  • Run the script using python cityscapesScripts/cityscapesscripts/preparation/createPanopticImgs.py

Now the folder structure for the dataset should look as follows:

EfficientPS
└── data
    └── cityscapes
        ├── annotations
        ├── train
        ├── cityscapes_panoptic_val.json
        └── val

How to train

SOLOv2

  • Go into the SOLOv2 folder using cd SOLOv2
  • Modify config.yaml to change the paths
  • Run python setup.py develop
  • Run python train.py

EfficientPS

  • Go into the EfficientPS folder using cd ../EfficientPS
  • Run python train_net.py

How to run inference

  1. Go into the SOLOv2 folder using cd SOLOv2
  2. Run python eval.py. This will save the SOLOv2 masks in EfficientPS/solo_outputs
  3. Now go into the EfficientPS folder using cd ../EfficientPS
  4. Run the combined evaluation using python solo_fusion.py

The results will be saved in EfficientPS/Outputs

Why SOLOv2?

(Figure: motivation for SOLOv2)

EfficientPS Architecture

The original EfficientPS paper: here
Code from the authors of EfficientPS: here

(Figure: EfficientPS architecture)

Why EfficientPS?

Early research treated instance segmentation and semantic segmentation as separate problems. Initial panoptic segmentation methods heuristically combined the predictions of a state-of-the-art instance segmentation network and a semantic segmentation network in a post-processing step; this incurred large computational overhead, redundant learning, and discrepancies between the two networks' predictions.
Later works operated in a top-down manner with shared components, or sequentially in a bottom-up manner, but still made limited use of component sharing and suffered from low computational efficiency, slow runtimes, and subpar results.
EfficientPS addresses these issues with:

  • Shared backbone: EfficientNet
  • 2-way FPN: semantically rich multi-scale features
  • Feature-aligning semantic head; modified Mask R-CNN instance head
  • Panoptic fusion module: dynamic fusion of logits based on mask confidences
  • Jointly optimized end-to-end, with depth-wise separable convolutions and Leaky ReLU
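The fusion idea can be illustrated with a simplified heuristic: paste instance masks onto the semantic prediction in descending order of confidence, so stronger instances claim pixels first. This is only a sketch of the concept, not the actual EfficientPS fusion module (which adaptively fuses the logits themselves); the id scheme and thresholds below are hypothetical.

```python
import numpy as np

def naive_panoptic_fusion(semantic_logits, instance_masks, scores,
                          thing_offset=1000, score_thresh=0.5, overlap_thresh=0.5):
    """Paste instance masks onto the semantic argmax, highest confidence first.

    semantic_logits: (num_classes, H, W) array of semantic-head scores
    instance_masks:  list of (H, W) boolean instance masks
    scores:          per-instance confidence scores
    """
    panoptic = semantic_logits.argmax(axis=0).astype(np.int64)  # "stuff" base prediction
    occupied = np.zeros(panoptic.shape, dtype=bool)
    next_id = 0
    for i in np.argsort(scores)[::-1]:            # most confident instance first
        if scores[i] < score_thresh:
            continue
        free = instance_masks[i] & ~occupied      # keep only pixels not yet claimed
        if free.sum() < overlap_thresh * max(instance_masks[i].sum(), 1):
            continue                              # mostly hidden behind stronger instances
        next_id += 1
        panoptic[free] = thing_offset + next_id   # hypothetical instance-id scheme
        occupied |= free
    return panoptic
```

Ordering by confidence resolves overlaps deterministically, which is the same intuition the learned fusion module refines by weighting logits with mask confidences.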

Novelty of this approach

We replace the Mask R-CNN instance head of EfficientPS with a SOLOv2 head in order to improve the model's instance segmentation.
The Mask R-CNN losses are replaced by SOLOv2's Focal Loss for semantic category classification and Dice Loss for mask prediction.
Using a location-based instance segmentation head in this way aims to improve the panoptic performance metrics.
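For illustration, the two SOLOv2 losses mentioned above can be sketched in NumPy. These are simplified stand-ins for a single mask / category vector, not the repo's actual implementations:

```python
import numpy as np

def dice_loss(pred, gt, eps=1e-6):
    """Soft Dice loss: pred is a sigmoid probability map, gt a binary mask."""
    inter = (pred * gt).sum()
    return 1.0 - (2.0 * inter + eps) / ((pred ** 2).sum() + (gt ** 2).sum() + eps)

def sigmoid_focal_loss(logits, gt, alpha=0.25, gamma=2.0):
    """Focal loss on per-category sigmoid logits; down-weights easy examples."""
    p = 1.0 / (1.0 + np.exp(-logits))
    p_t = p * gt + (1.0 - p) * (1.0 - gt)            # probability of the true label
    alpha_t = alpha * gt + (1.0 - alpha) * (1.0 - gt)
    ce = -np.log(np.clip(p_t, 1e-12, 1.0))           # cross-entropy per element
    return (alpha_t * (1.0 - p_t) ** gamma * ce).mean()
```

Dice loss directly optimizes mask overlap, which suits SOLOv2's dense per-location mask prediction, while the focal term keeps the many easy background locations from dominating the category loss.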

Results

(Qualitative result figures)
