YOLACT Real-time Instance Segmentation

Introduction

This is a TensorFlow 2 implementation of the paper YOLACT: Real-time Instance Segmentation, accepted at ICCV 2019. The paper presents a fully convolutional model for real-time instance segmentation that extends an existing object detection architecture with a parallel prototype-generation branch. The goal of this repo is to provide a general way to use this model and to give users more flexible options (custom datasets, different backbone choices, anchor scales, and learning rate schedules) for their own specific needs, based on the ideas of the original paper.

Model

Here is the illustration of YOLACT from the original paper.
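
The core idea is that the protonet predicts k prototype masks for the whole image, while the prediction head predicts k mask coefficients per detection; final instance masks are a linear combination of the two followed by a sigmoid. A minimal sketch of that assembly step (shapes and names are illustrative, not taken from this repo's code):

import tensorflow as tf

def assemble_masks(prototypes, coefficients):
    # prototypes:   [H, W, k] prototype masks from the protonet
    # coefficients: [n, k]    mask coefficients for n detections
    # returns:      [n, H, W] instance masks in [0, 1]
    h, w, k = prototypes.shape
    flat = tf.reshape(prototypes, [h * w, k])                             # [H*W, k]
    masks = tf.sigmoid(tf.matmul(flat, coefficients, transpose_b=True))   # [H*W, n]
    return tf.transpose(tf.reshape(masks, [h, w, -1]), [2, 0, 1])         # [n, H, W]

protos = tf.random.normal([138, 138, 32])    # paper defaults: 138x138 prototypes, k = 32
coeffs = tf.random.normal([5, 32])           # coefficients for 5 detections
print(assemble_masks(protos, coeffs).shape)  # (5, 138, 138)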

A. Dataset and Pre-processing

1. Prepare the COCO 2017 TFRecord Dataset

2017 Train images / 2017 Val images / 2017 Annotations

Extract /train2017, /val2017, and /annotations/instances_train2017.json, /annotations/instances_val2017.json from the downloaded archives into the ./data folder of the repo, and run:

python -m  data.coco_tfrecord_creator -train_image_dir './data/train2017' 
                                      -val_image_dir './data/val2017' 
                                      -train_annotations_file './data/instances_train2017.json' 
                                      -val_annotations_file './data/instances_val2017.json' 
                                      -output_dir './data/coco'
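
To sanity-check the result, you can count the serialized examples with tf.data. The file pattern below is an assumption; adjust it to whatever names coco_tfrecord_creator actually writes under ./data/coco:

import tensorflow as tf

# Count the training examples in the generated TFRecord files.
files = tf.io.gfile.glob('./data/coco/*train*')
dataset = tf.data.TFRecordDataset(files)
print('train examples:', sum(1 for _ in dataset))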

2. Prepare the Pascal SBD Dataset

benchmark.tgz / Pascal SBD annotation (the COCO-style annotations from the original yolact repo)

Extract the /benchmark/dataset/img folder from benchmark.tgz, and pascal_sbd_train.json, pascal_sbd_valid.json from the annotations into the ./data folder of the repo. Divide the images into two folders (/pascal_train for training images, /pascal_val for validation images) and run:

python -m  data.coco_tfrecord_creator -train_image_dir './data/pascal_train' 
                                      -val_image_dir './data/pascal_val' 
                                      -train_annotations_file './data/pascal_sbd_train.json' 
                                      -val_annotations_file './data/pascal_sbd_valid.json' 
                                      -output_dir './data/pascal'
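
One way to do the split is to copy every image referenced by each COCO-style JSON into its own folder. This is only a sketch; it assumes the standard COCO 'images'/'file_name' fields and that the extracted images sit in ./data/img:

import json, os, shutil

def split_images(ann_file, src_dir, dst_dir):
    # Copy every image referenced by a COCO-style annotation file into dst_dir.
    os.makedirs(dst_dir, exist_ok=True)
    with open(ann_file) as f:
        for img in json.load(f)['images']:
            shutil.copy(os.path.join(src_dir, img['file_name']), dst_dir)

split_images('./data/pascal_sbd_train.json', './data/img', './data/pascal_train')
split_images('./data/pascal_sbd_valid.json', './data/img', './data/pascal_val')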

3. Prepare your Custom Dataset

Create a folder of training images, a folder of validation images, and COCO-style annotation files like the above for your dataset in the ./data folder of the repo, and run:

python -m  data.coco_tfrecord_creator -train_image_dir 'path to your training images' 
                                      -val_image_dir   'path to your validation images'  
                                      -train_annotations_file 'path to your training annotations' 
                                      -val_annotations_file 'path to your validation annotations' 
                                      -output_dir './data/name of the dataset'
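
If you are building the annotation files yourself, a COCO-style annotation has three top-level lists: images, annotations, and categories. A minimal skeleton (all values are illustrative) looks like this:

import json

annotation = {
    "images": [
        {"id": 1, "file_name": "img_0001.jpg", "width": 640, "height": 480}
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 120, 200, 150],          # [x, y, width, height]
         "segmentation": [[100, 120, 300, 120, 300, 270, 100, 270]],
         "area": 30000, "iscrowd": 0}
    ],
    "categories": [
        {"id": 1, "name": "my_class", "supercategory": "none"}
    ],
}

with open('./data/my_dataset_train.json', 'w') as f:
    json.dump(annotation, f)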

Training

1. Configuration for COCO, Pascal SBD

The configuration for an experiment can be adjusted in config.py. The default hyperparameters from the original paper are already written there as an example of how to customize them. You can adjust the following parameters:

Parameters for Parser

Parameter          Description
NUM_MAX_PAD        The maximum padding length for batching samples.
THRESHOLD_POS      The positive IoU threshold for anchor matching.
THRESHOLD_NEG      The negative IoU threshold for anchor matching.
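
THRESHOLD_POS and THRESHOLD_NEG control how anchors are labeled during parsing: an anchor whose best IoU with any ground-truth box is at or above the positive threshold becomes a positive, one below the negative threshold becomes a negative, and anything in between is ignored. A minimal sketch of that rule (not this repo's exact parser code):

import tensorflow as tf

def match_anchors(iou, threshold_pos=0.5, threshold_neg=0.4):
    # iou: [num_anchors, num_gt] IoU matrix.
    # Returns per-anchor labels: 1 = positive, 0 = negative, -1 = ignored.
    best_iou = tf.reduce_max(iou, axis=1)              # best ground truth for each anchor
    labels = -tf.ones_like(best_iou, dtype=tf.int32)   # start as "ignore"
    labels = tf.where(best_iou >= threshold_pos, 1, labels)
    labels = tf.where(best_iou < threshold_neg, 0, labels)
    return labels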

Parameters for Model

Parameter          Description
BACKBONE           The name of the backbone model defined in backbones_objects.
IMG_SIZE           The input size of images.
PROTO_OUTPUT_SIZE  The output size of the protonet.
FPN_CHANNELS       The number of convolution channels used in the FPN.
NUM_MASK           The number of predicted prototype masks for linear combination.

Parameters for Loss

Parameter          Description
LOSS_WEIGHT_CLS    The loss weight for classification.
LOSS_WEIGHT_BOX    The loss weight for bounding box regression.
LOSS_WEIGHT_MASK   The loss weight for mask prediction.
LOSS_WEIGHT_SEG    The loss weight for segmentation.
NEG_POS_RATIO      The negative/positive ratio for OHEM in classification.
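
The total loss is the weighted sum of these four terms, and the classification term uses online hard example mining (OHEM): only the hardest negative anchors are kept, at most NEG_POS_RATIO per positive anchor. A rough sketch of the negative selection (illustrative, not the repo's exact implementation):

import tensorflow as tf

def ohem_negatives(conf_loss, positive_mask, neg_pos_ratio=3):
    # conf_loss:     [num_anchors] per-anchor classification loss
    # positive_mask: [num_anchors] bool, True for anchors matched to a ground-truth box
    num_pos = tf.reduce_sum(tf.cast(positive_mask, tf.int32))
    num_neg = neg_pos_ratio * num_pos

    # Zero out positives so they are never selected as negatives, then rank by loss.
    neg_loss = tf.where(positive_mask, tf.zeros_like(conf_loss), conf_loss)
    order = tf.argsort(neg_loss, direction='DESCENDING')
    rank = tf.argsort(order)            # rank of each anchor by its loss
    return (rank < num_neg) & ~positive_mask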

Parameters for Detection

Parameter          Description
CONF_THRESHOLD     The confidence score threshold for filtering candidate detections.
TOP_K              The maximum number of candidate detections fed into Fast NMS.
NMS_THRESHOLD      The IoU threshold for Fast NMS.
MAX_NUM_DETECTION  The maximum number of final detections.
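
In other words, detections below CONF_THRESHOLD are dropped, the TOP_K highest-scoring candidates go into Fast NMS, and a box is suppressed when it overlaps an already-kept, higher-scoring box by more than NMS_THRESHOLD. The core of Fast NMS, as described in the paper, is a single matrix operation; a sketch (applied per class in practice):

import tensorflow as tf

def pairwise_iou(a, b):
    # a: [n, 4], b: [m, 4] boxes as [x1, y1, x2, y2]; returns an [n, m] IoU matrix.
    lt = tf.maximum(a[:, None, :2], b[None, :, :2])
    rb = tf.minimum(a[:, None, 2:], b[None, :, 2:])
    wh = tf.maximum(rb - lt, 0.0)
    inter = wh[..., 0] * wh[..., 1]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def fast_nms_keep(boxes, nms_threshold=0.5):
    # boxes: [n, 4] already filtered by CONF_THRESHOLD, sorted by descending
    # score, and truncated to TOP_K. Returns a boolean keep mask.
    iou = pairwise_iou(boxes, boxes)
    # Strict upper triangle: IoU of each box with every higher-scoring box.
    upper = tf.linalg.band_part(iou, 0, -1) - tf.linalg.band_part(iou, 0, 0)
    return tf.reduce_max(upper, axis=0) <= nms_threshold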

2. Configuration for Custom Dataset (to be updated)

3. Check the Dataset Sample

4. Training Script

-> Training for COCO:

python train.py -name 'coco'
                -tfrecord_dir './data'
                -weights './weights' 
                -batch_size '8'
                -momentum '0.9'
                -weight_decay '5 * 1e-4'
                -print_interval '10'
                -save_interval '5000'

-> Training for Pascal SBD:

python train.py -name 'pascal'
                -tfrecord_dir './data'
                -weights './weights' 
                -batch_size '8'
                -momentum '0.9'
                -weight_decay '5 * 1e-4'
                -print_interval '10'
                -save_interval '5000'

-> Training for custom dataset:

python train.py -name 'name of your dataset'
                -tfrecord_dir './data'
                -weights 'path to store weights' 
                -batch_size 'batch_size'
                -momentum 'momentum for SGD'
                -weight_decay 'weight_decay rate for SGD'
                -print_interval 'interval for printing training result'
                -save_interval 'interval for evaluation'
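
The original paper trains with SGD and a step learning rate schedule (the learning rate is divided by 10 at fixed iterations). If you want to build such a schedule yourself, tf.keras provides PiecewiseConstantDecay; the boundaries and values below are placeholders, not this repo's defaults:

import tensorflow as tf

# Step schedule: drop the learning rate by 10x at chosen iterations.
schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[280000, 600000, 700000],
    values=[1e-3, 1e-4, 1e-5, 1e-6])

optimizer = tf.keras.optimizers.SGD(
    learning_rate=schedule, momentum=0.9)  # weight decay is handled separately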

Inference (to be updated)

There are several evaluation scenarios.

Test Detection

Evaluation

Images

Videos

Pretrained Weights (to be updated)

First Header                  Second Header
Content from cell 1           Content from cell 2
Content in the first column   Content in the second column

Authors

  • HSU, CHIH-CHAO - Professional Machine Learning Master's Student at Mila

Reference
