Adaptive Multi-Modal Cross-Entropy Loss for Stereo Matching

arXiv | Poster | Video

Abstract

Despite the great success of deep learning in stereo matching, recovering accurate disparity maps is still challenging. Currently, L1 and cross-entropy are the two most widely used losses for stereo network training. Compared with the former, the latter usually performs better thanks to its probability modeling and direct supervision of the cost volume. However, how to accurately model the stereo ground-truth for cross-entropy loss remains largely under-explored. Existing works simply assume that the ground-truth distributions are uni-modal, which ignores the fact that most of the edge pixels can be multi-modal. In this paper, a novel adaptive multi-modal cross-entropy loss (ADL) is proposed to guide the networks to learn different distribution patterns for each pixel. Moreover, we optimize the disparity estimator to further alleviate the bleeding or misalignment artifacts at inference. Extensive experimental results on public datasets show that our method is general and can help classic stereo networks regain state-of-the-art performance. In particular, GANet with our method ranks $1^{st}$ on both the KITTI 2015 and 2012 benchmarks among the published methods. Meanwhile, excellent synthetic-to-realistic generalization performance can be achieved by simply replacing the traditional loss with ours.

Environment

python == 3.9.12
pytorch == 1.11.0
torchvision == 0.12.0
numpy == 1.21.5
apex == 0.1
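
To confirm that an existing Python environment matches the versions listed above, a small check along these lines may help. It is only a sketch and not part of the repository: the distribution names in REQUIRED (for example "torch" rather than "pytorch") and both helper functions are assumptions made for illustration.

```python
from importlib import metadata

# Versions pinned in the Environment list above. The distribution names
# ("torch", not "pytorch") are an assumption of this sketch.
REQUIRED = {
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "numpy": "1.21.5",
    "apex": "0.1",
}


def installed_versions(names):
    """Return {name: version or None} for the given distribution names."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(required, installed):
    """List (name, wanted, found) for every package that is missing or differs."""
    mismatches = []
    for name, wanted in required.items():
        found = installed.get(name)
        # Local build tags such as "1.11.0+cu113" are treated as matching.
        if found is None or found.split("+")[0] != wanted:
            mismatches.append((name, wanted, found))
    return mismatches


if __name__ == "__main__":
    for name, wanted, found in find_mismatches(REQUIRED, installed_versions(REQUIRED)):
        print(f"{name}: expected {wanted}, found {found}")
```

An empty output means every listed package was found at the pinned version.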

Datasets

Download the datasets, then set the datapath argument in ./scripts/sceneflow.sh or ./scripts/kitti.sh to the location of the downloaded data.

Training

We train the model with distributed data parallel (DDP).

Run the corresponding bash script in ./scripts/:

/bin/bash ./scripts/sceneflow.sh
/bin/bash ./scripts/kitti.sh

Training logs are saved in ./log/.

Set the loss_func argument to select the loss:

SL1: smooth L1 loss
ADL: ADaptive multi-modal cross-entropy Loss
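
The difference between the two kinds of loss can be illustrated for a single pixel. The sketch below is not the code in ./losses: it uses plain Python instead of PyTorch, and the Laplacian shape, the scale, the mode centers, and the mixture weights are all hypothetical values chosen for illustration. Smooth L1 penalizes one regressed disparity, while cross-entropy compares the predicted distribution over disparity candidates with a target distribution, which can be uni-modal or, for an edge pixel, multi-modal.

```python
import math


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 penalty on a single regressed disparity value."""
    diff = abs(pred - target)
    return 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta


def laplacian_target(num_disp, center, scale=1.0):
    """Discrete Laplacian over disparities 0..num_disp-1, peaked at `center`."""
    weights = [math.exp(-abs(d - center) / scale) for d in range(num_disp)]
    total = sum(weights)
    return [w / total for w in weights]


def mixture_target(num_disp, centers, mix_weights, scale=1.0):
    """Weighted sum of Laplacians, one mode per entry in `centers`."""
    modes = [laplacian_target(num_disp, c, scale) for c in centers]
    return [sum(w * m[d] for w, m in zip(mix_weights, modes)) for d in range(num_disp)]


def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy H(target, pred) between two discrete distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


# Hypothetical edge pixel: foreground at disparity 8, background at disparity 2.
uni = laplacian_target(12, center=8)
multi = mixture_target(12, centers=[8, 2], mix_weights=[0.7, 0.3])

# A prediction that matches the uni-modal target is penalized less against
# that target than a prediction that spreads mass over both modes.
print(cross_entropy(uni, uni), cross_entropy(uni, multi))
```

How ADL actually decides the number, position, and weight of the modes for each pixel is described in the paper and implemented in ./losses; the fixed centers and weights here only show the shape of the computation.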

To train GANet, install the NVIDIA Apex package and compile the GANet libs.

Evaluation

Please uncomment the relevant lines in val.py and run it.

The reported metrics are EPE, 1px, 2px, 3px, D1, 4px, and speed.
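
For readers unfamiliar with these metrics, the sketch below uses their common definitions: EPE is the mean absolute disparity error, the k-px error is the percentage of pixels whose error exceeds k pixels, and D1 follows the KITTI 2015 convention of counting a pixel as an outlier when its error exceeds both 3 px and 5% of the true disparity. This is an assumption about what val.py computes, not a copy of it; the function names are hypothetical, and masking of invalid pixels is assumed to have happened already.

```python
def epe(pred, gt):
    """End-point error: mean absolute disparity error in pixels."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(gt)


def px_error(pred, gt, threshold):
    """Percentage of pixels whose absolute error exceeds `threshold` pixels."""
    bad = sum(1 for p, g in zip(pred, gt) if abs(p - g) > threshold)
    return 100.0 * bad / len(gt)


def d1(pred, gt):
    """Percentage of pixels with error > 3 px and > 5% of the true disparity."""
    bad = sum(
        1 for p, g in zip(pred, gt)
        if abs(p - g) > 3.0 and abs(p - g) > 0.05 * g
    )
    return 100.0 * bad / len(gt)


# Four valid pixels, flattened; absolute errors are 0.5, 2.5, 4.0 and 0.0.
gt = [10.0, 20.0, 100.0, 50.0]
pred = [10.5, 22.5, 104.0, 50.0]

print("EPE:", epe(pred, gt))
for k in (1, 2, 3, 4):
    print(f"{k}px:", px_error(pred, gt, k))
print("D1:", d1(pred, gt))
```

In this example the third pixel counts toward the 3px error but not toward D1, because its 4 px error is below 5% of its true disparity of 100.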

Set the estimator argument to select the disparity estimator:

softargmax: soft-argmax
argmax: argmax
SME: Single-Modal disparity Estimator
DME: Dominant-Modal disparity Estimator
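
The choice of estimator matters most when the predicted distribution has more than one peak. The single-pixel sketch below, in plain Python, shows soft-argmax and argmax with their usual definitions, plus a windowed soft-argmax around the peak. The windowed variant, its name, and its radius are hypothetical stand-ins for a single-modal estimator; neither it nor anything else here is the code in ./disparity_estimators, and DME is not sketched.

```python
def soft_argmax(prob):
    """Expected disparity under the full distribution."""
    return sum(d * p for d, p in enumerate(prob))


def argmax(prob):
    """Index of the most probable disparity (integer precision only)."""
    return max(range(len(prob)), key=lambda d: prob[d])


def local_soft_argmax(prob, radius=1):
    """Soft-argmax restricted to a window around the peak, renormalized.

    Hypothetical stand-in for a single-modal estimator; not the repo's SME.
    """
    peak = argmax(prob)
    lo, hi = max(0, peak - radius), min(len(prob), peak + radius + 1)
    mass = sum(prob[lo:hi])
    return sum(d * prob[d] for d in range(lo, hi)) / mass


# Bimodal edge pixel: background mode near 2, foreground mode near 8.
prob = [0.0, 0.1, 0.2, 0.1, 0.0, 0.0, 0.0, 0.15, 0.3, 0.15, 0.0, 0.0]

print(soft_argmax(prob), argmax(prob), local_soft_argmax(prob))
```

On this distribution the full soft-argmax lands between the two modes, at a disparity belonging to neither surface, which is the kind of bleeding artifact the abstract refers to. The two peak-based estimates stay on the stronger mode.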
