gengshan-y/high-res-stereo

Hierarchical Deep Stereo Matching on High Resolution Images

Qualitative results on Middlebury:

Performance on Middlebury benchmark (y-axis: error, the lower the better):

Able to handle large view variation of high-res images (as a submodule in Open4D, CVPR 2020):

Requirements

  • tested with python 2.7.15 and 3.6.8
  • tested with pytorch 0.4.0, 0.4.1 and 1.0.0
  • a few packages need to be installed, for example, texttable

Weights

Note: The .tar file can be directly loaded in pytorch. No need to uncompress it.

Inference

Test on CrusadeP and dancing stereo pairs:

CUDA_VISIBLE_DEVICES=3 python submission.py --datapath ./data-mbtest/   --outdir ./mboutput --loadmodel ./weights/final-768px.tar  --testres 1 --clean 1.0 --max_disp -1

Evaluate on Middlebury additional images:

CUDA_VISIBLE_DEVICES=3 python submission.py --datapath ./path_to_additional_images   --outdir ./output --loadmodel ./weights/final-768px.tar  --testres 0.5
python eval_mb.py --indir ./output --gtdir ./groundtruth_path

Evaluate on HRRS:

CUDA_VISIBLE_DEVICES=3 python submission.py --datapath ./data-HRRS/   --outdir ./output --loadmodel ./weights/final-768px.tar  --testres 0.5
python eval_disp.py --indir ./output --gtdir ./data-HRRS/

Then use cvkit to visualize the results in 3D.
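
When running several evaluations, it can help to assemble the submission.py command from a script. The sketch below is only an illustration: build_inference_command and gpu_env are hypothetical helper names that are not part of this repository, and only the flags shown in the commands above are used.

```python
import os
import shlex


def build_inference_command(datapath, outdir, weights, testres,
                            clean=None, max_disp=None):
    """Return the argument list for submission.py (hypothetical helper)."""
    cmd = [
        "python", "submission.py",
        "--datapath", datapath,
        "--outdir", outdir,
        "--loadmodel", weights,
        "--testres", str(testres),
    ]
    # Optional flags are appended only when a value is given.
    if clean is not None:
        cmd += ["--clean", str(clean)]
    if max_disp is not None:
        cmd += ["--max_disp", str(max_disp)]
    return cmd


def gpu_env(device_ids):
    """Copy the current environment and restrict the visible GPUs."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(d) for d in device_ids)
    return env


cmd = build_inference_command(
    "./data-HRRS/", "./output", "./weights/final-768px.tar", 0.5
)
# Show the command as a copy-pastable shell line.
print(" ".join(shlex.quote(part) for part in cmd))
# To actually run it from the repository root:
#   import subprocess
#   subprocess.run(cmd, env=gpu_env([3]), check=True)
```

Passing the command as a list avoids shell quoting problems when paths contain spaces.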

Example outputs

left image

3D projection

disparity map

uncertainty map (brighter->higher uncertainty)

Parameters

  • testres: test resolution as a scale factor; 1 is full resolution, 0.5 is half resolution, and so on
  • max_disp: maximum disparity range to search
  • clean: cleaning threshold; clean=0 removes all pixels
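
As a rough illustration of testres and clean, the sketch below scales an image size and masks out uncertain pixels. It rests on assumptions rather than on the repository code: scaled_size and clean_disparity are hypothetical names, the rounding rule is a guess, and a pixel is assumed to be kept only when its uncertainty is strictly below clean (which is at least consistent with clean=0 removing every pixel).

```python
import math


def scaled_size(height, width, testres):
    """Image size after applying the testres scale factor (assumed rounding)."""
    return int(round(height * testres)), int(round(width * testres))


def clean_disparity(disparity, uncertainty, clean):
    """Mark a pixel invalid (inf) unless its uncertainty is below `clean`.

    Assumption: larger values in `uncertainty` mean less reliable pixels.
    """
    return [
        [d if u < clean else math.inf for d, u in zip(d_row, u_row)]
        for d_row, u_row in zip(disparity, uncertainty)
    ]


# Illustrative values only, not real model output.
disparity = [[10.0, 20.0], [30.0, 40.0]]
uncertainty = [[0.1, 0.9], [0.4, 0.6]]

print(scaled_size(2000, 3000, 0.5))
print(clean_disparity(disparity, uncertainty, 0.5))
print(clean_disparity(disparity, uncertainty, 0))
```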

Data

train/val

test

High-res-real-stereo (HR-RS): this dataset has been taken down due to a licensing issue. Please use the Argoverse dataset instead.

Train

  1. Download and extract the training data into the folder /d/. The training data include the Middlebury train set, HR-VS, KITTI-12/15, ETH3D, and SceneFlow.
  2. Run:
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --maxdisp 384 --batchsize 28 --database /d/ --logname log1 --savemodel /somewhere/  --epochs 10
  3. Evaluate on Middlebury additional images and the KITTI validation set. After 40k iterations, the average error on Middlebury additional images excluding Shopvac (perfect+imperfect, 24 stereo pairs in total) at half resolution should be around 5.7.
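
The averaging in the last step can be sketched as follows. This is an assumption-laden illustration, not the logic of eval_mb.py: average_error is a hypothetical helper, pair names are assumed to look like "<scene>-perfect" and "<scene>-imperfect", and the error values are placeholders rather than measured results.

```python
def average_error(per_pair_error, exclude=("Shopvac",)):
    """Mean error over stereo pairs, skipping scenes named in `exclude`."""
    kept = [
        err for name, err in per_pair_error.items()
        if not any(name.startswith(scene) for scene in exclude)
    ]
    if not kept:
        raise ValueError("no stereo pairs left after exclusion")
    return sum(kept) / len(kept)


# Placeholder numbers for illustration only.
errors = {
    "Cable-perfect": 4.0,
    "Cable-imperfect": 6.0,
    "Shopvac-perfect": 100.0,
    "Shopvac-imperfect": 100.0,
}
print(average_error(errors))
```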

Citation

@InProceedings{yang2019hsm,
author = {Yang, Gengshan and Manela, Joshua and Happold, Michael and Ramanan, Deva},
title = {Hierarchical Deep Stereo Matching on High-Resolution Images},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}

Acknowledgement

Part of the code is borrowed from MiddEval-SDK, PSMNet, FlowNetPytorch and pytorch-semseg. Thanks to SorcererX for fixing version compatibility issues.

About

Hierarchical Deep Stereo Matching on High Resolution Images, CVPR 2019.
