DNLNet

Disentangled Non-Local Neural Networks

Introduction

Official Repo

Code Snippet

Abstract

The non-local block is a popular module for strengthening the context modeling ability of a regular convolutional neural network. This paper first studies the non-local block in depth, where we find that its attention computation can be split into two terms, a whitened pairwise term accounting for the relationship between two pixels and a unary term representing the saliency of every pixel. We also observe that the two terms trained alone tend to model different visual clues, e.g. the whitened pairwise term learns within-region relationships while the unary term learns salient boundaries. However, the two terms are tightly coupled in the non-local block, which hinders the learning of each. Based on these findings, we present the disentangled non-local block, where the two terms are decoupled to facilitate learning for both terms. We demonstrate the effectiveness of the decoupled design on various tasks, such as semantic segmentation on Cityscapes, ADE20K and PASCAL Context, object detection on COCO, and action recognition on Kinetics.
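The split described in the abstract can be checked numerically. The sketch below is not the repository's implementation. It assumes plain dot-product attention with a softmax over keys, and it leaves out the learned projections and any temperature. Under that assumption, `q_i · k_j = (q_i - mu_q) · (k_j - mu_k) + mu_q · k_j + (terms that do not depend on j)`, so the softmax over `j` is unchanged when only the whitened pairwise term and the unary term are kept. All function names, and the `unary_logits` argument standing in for a learned per-pixel projection, are hypothetical.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nonlocal_attention(queries, keys):
    """Standard attention: for each query i, softmax over j of q_i . k_j."""
    return [softmax([dot(q, k) for k in keys]) for q in queries]

def split_attention(queries, keys):
    """The same attention, written as whitened pairwise + unary logits."""
    mu_q, mu_k = mean_vector(queries), mean_vector(keys)
    unary = [dot(mu_q, k) for k in keys]  # depends on the key pixel j only
    rows = []
    for q in queries:
        pairwise = [dot(sub(q, mu_q), sub(k, mu_k)) for k in keys]
        rows.append(softmax([p + u for p, u in zip(pairwise, unary)]))
    return rows

def disentangled_attention(queries, keys, unary_logits):
    """Decoupled variant (assumption): normalize each term on its own, then add.

    unary_logits is a hypothetical stand-in for a separately learned
    per-pixel saliency score.
    """
    mu_q, mu_k = mean_vector(queries), mean_vector(keys)
    unary = softmax(unary_logits)
    rows = []
    for q in queries:
        pairwise = softmax([dot(sub(q, mu_q), sub(k, mu_k)) for k in keys])
        rows.append([p + u for p, u in zip(pairwise, unary)])
    return rows

random.seed(0)
N, C = 6, 4  # 6 "pixels", 4 channels (toy sizes)
queries = [[random.gauss(0, 1) for _ in range(C)] for _ in range(N)]
keys = [[random.gauss(0, 1) for _ in range(C)] for _ in range(N)]

a = nonlocal_attention(queries, keys)
b = split_attention(queries, keys)
# Largest element-wise difference between the two attention maps.
max_gap = max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

d = disentangled_attention(queries, keys, [random.gauss(0, 1) for _ in range(N)])
```

Here `max_gap` should be at floating-point noise level, which shows that the two terms fully determine the standard attention map. In the decoupled variant each term has its own softmax, so the gradient of one no longer passes through a normalization shared with the other.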

Results and models (in progress)

Cityscapes

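| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | Device | mIoU | mIoU(ms+flip) | config | download |
| ------ | -------- | --------- | ------: | -------: | -------------: | ------ | ---: | ------------: | ------ | -------- |
| DNLNet | R-50-D8 | 512x1024 | 40000 | 7.3 | 2.56 | V100 | 78.61 | - | config | model \| log |
| DNLNet | R-101-D8 | 512x1024 | 40000 | 10.9 | 1.96 | V100 | 78.31 | - | config | model \| log |
| DNLNet | R-50-D8 | 769x769 | 40000 | 9.2 | 1.50 | V100 | 78.44 | 80.27 | config | model \| log |
| DNLNet | R-101-D8 | 769x769 | 40000 | 12.6 | 1.02 | V100 | 76.39 | 77.77 | config | model \| log |
| DNLNet | R-50-D8 | 512x1024 | 80000 | - | - | V100 | 79.33 | - | config | model \| log |
| DNLNet | R-101-D8 | 512x1024 | 80000 | - | - | V100 | 80.41 | - | config | model \| log |
| DNLNet | R-50-D8 | 769x769 | 80000 | - | - | V100 | 79.36 | 80.70 | config | model \| log |
| DNLNet | R-101-D8 | 769x769 | 80000 | - | - | V100 | 79.41 | 80.68 | config | model \| log |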

ADE20K

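| Method | Backbone | Crop Size | Lr schd | Mem (GB) | Inf time (fps) | Device | mIoU | mIoU(ms+flip) | config | download |
| ------ | -------- | --------- | ------: | -------: | -------------: | ------ | ---: | ------------: | ------ | -------- |
| DNLNet | R-50-D8 | 512x512 | 80000 | 8.8 | 20.66 | V100 | 41.76 | 42.99 | config | model \| log |
| DNLNet | R-101-D8 | 512x512 | 80000 | 12.8 | 12.54 | V100 | 43.76 | 44.91 | config | model \| log |
| DNLNet | R-50-D8 | 512x512 | 160000 | - | - | V100 | 41.87 | 43.01 | config | model \| log |
| DNLNet | R-101-D8 | 512x512 | 160000 | - | - | V100 | 44.25 | 45.78 | config | model \| log |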

Notes

This example reproduces "Disentangled Non-Local Neural Networks" for semantic segmentation. It is still in progress.

Citation

@inproceedings{yin2020disentangled,
    title={Disentangled Non-Local Neural Networks},
    author={Minghao Yin and Zhuliang Yao and Yue Cao and Xiu Li and Zheng Zhang and Stephen Lin and Han Hu},
    year={2020},
    booktitle={ECCV}
}