
FlatteNet 😅 😅 😅

Introduction

This is a partially official PyTorch implementation accompanying the publication *Flattenet: A Simple and Versatile Framework for Dense Pixelwise Prediction*. This repo contains only the semantic segmentation part. Note that a number of modifications have been made since the paper was published in order to improve performance. We evaluate the adapted method on PASCAL-Context and PASCAL VOC 2012.

To deal with the reduced-feature-resolution problem, we introduce a novel Flattening Module, which takes as input the coarse-grained feature maps (patch-wise visual descriptors) produced by a Fully Convolutional Network (FCN) and outputs dense pixel-wise visual descriptors. This process is represented in the schematic diagram below. An FCN equipped with the Flattening Module, which we refer to as FlatteNet, can accomplish various dense prediction tasks in an effective and efficient manner.
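To make the idea concrete, here is a minimal sketch of turning patch-wise descriptors back into pixel-wise ones via a 1x1 convolution followed by sub-pixel rearrangement (`nn.PixelShuffle`). This is illustrative only, with hypothetical names and an assumed upsampling factor of 16; it is not the repo's actual implementation.

```python
import torch
import torch.nn as nn

class FlatteningModuleSketch(nn.Module):
    """Illustrative sketch: expand each patch-wise descriptor into a
    factor x factor block of pixel-wise descriptors."""

    def __init__(self, in_channels, out_channels, factor=16):
        super().__init__()
        # Predict factor*factor pixel descriptors from each patch descriptor.
        self.expand = nn.Conv2d(in_channels, out_channels * factor * factor, kernel_size=1)
        # Rearrange the channel dimension into a factor x factor spatial block.
        self.shuffle = nn.PixelShuffle(factor)

    def forward(self, x):
        return self.shuffle(self.expand(x))

# A 1/16-resolution feature map becomes a full-resolution descriptor map.
coarse = torch.randn(1, 2048, 30, 30)  # e.g. backbone output for a 480x480 input
module = FlatteningModuleSketch(2048, 64, factor=16)
dense = module(coarse)
print(dense.shape)  # torch.Size([1, 64, 480, 480])
```

The key property is that the dense descriptors are produced by a single lightweight head rather than a chain of decoder stages.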

An illustration of the structure of the Flattening Module is displayed below. We have also incorporated a context aggregation component into the design, implemented as either a pyramid pooling module or a self-attention module.
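For reference, a generic PSPNet-style pyramid pooling module looks like the following. This is a standard sketch of the technique, assuming the usual bin sizes (1, 2, 3, 6); the repo's exact layer configuration may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPoolingSketch(nn.Module):
    """Illustrative pyramid pooling module: pool the feature map to several
    grid sizes, project each with a 1x1 conv, upsample back to the input
    resolution, and concatenate everything with the input."""

    def __init__(self, in_channels, branch_channels=512, bins=(1, 2, 3, 6)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(b),
                nn.Conv2d(in_channels, branch_channels, kernel_size=1, bias=False),
                nn.BatchNorm2d(branch_channels),
                nn.ReLU(inplace=True),
            )
            for b in bins
        )

    def forward(self, x):
        h, w = x.shape[2:]
        outs = [x]
        for branch in self.branches:
            outs.append(F.interpolate(branch(x), size=(h, w),
                                      mode='bilinear', align_corners=False))
        return torch.cat(outs, dim=1)

feats = torch.randn(1, 2048, 30, 30)
ppm = PyramidPoolingSketch(2048)
ppm.eval()  # eval mode: BatchNorm needs >1 value per channel in training
with torch.no_grad():
    out = ppm(feats)
print(out.shape)  # torch.Size([1, 4096, 30, 30]) -> 2048 + 4*512 channels
```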

The overall architecture is displayed below.

Experimental Results

The training configuration files are included in the config directory. All models are trained on two NVIDIA GTX 1080 Ti GPUs with the official Sync-BN.

Note: We adopt the ResNet architecture tweaks proposed in the paper. The weights of the pretrained model are converted from Gluon CV.

PASCAL-Context

The models are evaluated using six scales of [0.5, 0.75, 1, 1.25, 1.5, 1.75] and flipping.

| Input size | Backbone | Context | mIoU (59 classes) | #Params (M) | GFLOPs |
|---|---|---|---|---|---|
| 480x480 | ResNet-101 | PPM | 53.3 | 49.7 | 40.0 |
| 512x512 | ResNet-101 | Self-Attention | 53.9 | 48.4 | 47.0 |

Note: The GFLOPs of self-attention block are not calculated.
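The multi-scale and flip evaluation protocol above can be sketched as follows. This is a generic implementation of the standard protocol, assuming class probabilities are averaged over all rescaled and flipped copies; it is not the repo's exact test code.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def multi_scale_flip_inference(model, image,
                               scales=(0.5, 0.75, 1.0, 1.25, 1.5, 1.75)):
    """Average per-pixel class probabilities over rescaled and
    horizontally flipped copies of the input image."""
    n, _, h, w = image.shape
    total = None
    for s in scales:
        scaled = F.interpolate(image, scale_factor=s,
                               mode='bilinear', align_corners=False)
        for flip in (False, True):
            inp = torch.flip(scaled, dims=[3]) if flip else scaled
            logits = model(inp)
            if flip:
                # Undo the flip so predictions align spatially.
                logits = torch.flip(logits, dims=[3])
            # Resize logits back to the original resolution, then softmax.
            probs = F.interpolate(logits, size=(h, w), mode='bilinear',
                                  align_corners=False).softmax(dim=1)
            total = probs if total is None else total + probs
    return total / (2 * len(scales))

# Sanity check with a toy "model": a single 1x1 conv producing 59 class logits.
model = torch.nn.Conv2d(3, 59, kernel_size=1)
image = torch.randn(1, 3, 64, 64)
probs = multi_scale_flip_inference(model, image)
print(probs.shape)  # torch.Size([1, 59, 64, 64])
```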

PASCAL VOC

The models are evaluated using six scales of [0.5, 0.75, 1, 1.25, 1.5, 1.75] and flipping on the PASCAL VOC 2012 test set. The input size is set to 512x512.

| COCO pretrain | Backbone | Context | mIoU | #Params (M) | GFLOPs |
|---|---|---|---|---|---|
| No | ResNet-101 | PPM | 83.09 | 49.7 | 45.4 |
| No | ResNet-101 | Self-Attention | 84.32 | 48.3 | 46.8 |
| Yes | ResNet-101 | Self-Attention | 85.69 | 48.3 | 46.8 |

Note: The GFLOPs of self-attention block are not calculated.

Installation and Data Preparation

PyTorch version: 1.3.1

Please refer to HRNet-Semantic-Segmentation for other details.

Train and Test

For example, to train FlatteNet on PASCAL-Context with a batch size of 8 on 2 GPUs:

```shell
python -m torch.distributed.launch --nproc_per_node=2 tools/train.py --cfg config/pctx_res101_att.yml
```

For example, to evaluate our model on the PASCAL-Context validation set with multi-scale and flip testing:

```shell
python tools/test.py --cfg config/pctx_res101_att.yml \
                     TEST.MODEL_FILE output/pascal_ctx/pctx_res101_att/best.pth
```

Remarks

Although this paper was written primarily in pursuit of my Master's degree, I sincerely hope this work or code will be helpful for your research. If you have any problems, feel free to open an issue.

Moreover, I could not test our method on other datasets/benchmarks due to limited access to computational resources. 😞 😞 😞

Note

Due to limited resources, this repo will not be maintained. For those interested in this work, please follow the configuration files in the config folder to train your own models. The pre-trained models are no longer available. I apologize for any inconvenience and thank you for your interest.

Acknowledgement

The code has been largely taken from HRNet-Semantic-Segmentation.

Other References:

PyTorch-Encoding
