AIVFI/Monocular-Depth-Estimation-Rankings-and-2D-to-3D-Video-Conversion-Rankings

Monocular Depth Estimation Rankings and 2D to 3D Video Conversion Rankings

List of Rankings

Each ranking lists only the best model for each method.
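
The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the maintainer's actual tooling: the `Entry` type, the `rank` function, and the sample scores are hypothetical placeholders, and the `threshold` argument mirrors cut-offs such as `AbsRel<=0.04` in the ranking titles.

```python
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    method: str   # hypothetical grouping key (one method may have several models)
    model: str    # a specific model or variant of that method
    score: float  # metric value, e.g. AbsRel (lower is better) or PSNR (higher is better)


def rank(entries: Iterable[Entry], threshold: float,
         lower_is_better: bool = True) -> List[Entry]:
    """Keep the best model per method, drop entries beyond the threshold, sort best first."""
    def better(a: float, b: float) -> bool:
        return a < b if lower_is_better else a > b

    # Step 1: keep only the best-scoring model of each method.
    best = {}
    for e in entries:
        current = best.get(e.method)
        if current is None or better(e.score, current.score):
            best[e.method] = e

    # Step 2: apply the ranking's cut-off (<= for "lower is better", >= otherwise).
    if lower_is_better:
        kept = [e for e in best.values() if e.score <= threshold]
    else:
        kept = [e for e in best.values() if e.score >= threshold]

    # Step 3: order from best to worst.
    return sorted(kept, key=lambda e: e.score, reverse=not lower_is_better)


# Made-up placeholder scores, not taken from any ranking.
sample = [
    Entry("A", "A-small", 0.07),
    Entry("A", "A-large", 0.05),
    Entry("B", "B-base", 0.04),
    Entry("C", "C-base", 0.09),
]
for position, entry in enumerate(rank(sample, threshold=0.06), start=1):
    print(position, entry.model, entry.score)
```

With the placeholder data, method A contributes only its better variant and method C falls outside the cut-off, so two entries remain.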

Monocular Depth Estimation Rankings

  1. UnrealStereo4K (3840×2160): AbsRel<=0.04
  2. MVS-Synth (1920×1080): AbsRel<=0.06
  3. HRSD (1920×1080): AbsRel<=0.08
  4. Middlebury2021 (1920×1080): SqRel<=0.5
  5. NYU-Depth V2 (640×480): OPW<=0.31
  6. NYU-Depth V2 (640×480): AbsRel<=0.058

2D to 3D Video Conversion Rankings

I. Video Inpainting Rankings

  • (to do)

II. Light Field Video Reconstruction from Monocular Video Rankings

  1. 👑 4DLFVD with up to 10×10 real light field views✔️: LPIPS😍 (no data)
    This will be the King of all rankings. We look forward to results from ambitious researchers.
  2. 4DLFVD with up to 10×10 real light field views✔️: PSNR😞 (no data)
  3. Hybrid with 7×7 synthetic light field views✖️: LPIPS😍 (no data)
  4. Hybrid with 7×7 synthetic light field views✖️: PSNR😞>=32dB

Appendices


UnrealStereo4K (3840×2160): AbsRel<=0.04

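| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|----|-------|----------------------|------------------|---------------------|-----------------|-------------|
| 1 | ZoeDepth +PFR=128 (arXiv; ENH: CVPR) | 0.0388 {1} (arXiv) | ENH: UnrealStereo4K | GitHub Stars; ENH: GitHub Stars | - | - |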

Back to Top Back to the List of Rankings

MVS-Synth (1920×1080): AbsRel<=0.06

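| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|----|-------|----------------------|------------------|---------------------|-----------------|-------------|
| 1 | ZoeDepth +PFR=128 (arXiv; ENH: CVPR) | 0.0589 {1} (arXiv) | ENH: MVS-Synth | GitHub Stars; ENH: GitHub Stars | - | - |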


HRSD (1920×1080): AbsRel<=0.08

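| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|----|-------|----------------------|------------------|---------------------|-----------------|-------------|
| 1 | DPT-B + R + AL (ICCV; ENH: CVPRW) | 0.074 {1} (CVPRW) | ENH: HRSD | GitHub Stars; ENH: - | - | - |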


Middlebury2021 (1920×1080): SqRel<=0.5

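| RK | Model | SqRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|----|-------|---------------------|------------------|---------------------|-----------------|-------------|
| 1 | LeReS-GBDMF (CVPR; ENH: AAAI) | 0.444 {1} (AAAI) | ENH: HR-WSI | GitHub Stars; ENH: GitHub Stars | - | - |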


NYU-Depth V2 (640×480): OPW<=0.31

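| RK | Model | OPW ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|----|-------|-------------------|------------------|---------------------|-----------------|-------------|
| 1 | FutureDepth (arXiv); Backbone: Swin-L | 0.303 {4} (arXiv) | NYU-Depth V2 | - | - | - |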


NYU-Depth V2 (640×480): AbsRel<=0.058

| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|----|-------|----------------------|------------------|---------------------|-----------------|-------------|
| 1 | Metric3D v2 CSTM_label (ICCV; ENH: arXiv); Backbone: DINOv2 with registers (ViT-L/14) | 0.042 {1} (arXiv) | DDAD & Lyft & DrivingStereo & DIML & Argoverse2 & Cityscapes & DSEC & Mapillary PSD & Pandaset & UASOL & Virtual KITTI & Waymo & Matterport3D & Taskonomy & Replica & ScanNet & HM3D & Hypersim | GitHub Stars | - | - |
| 2 | Depth Anything Large (CVPR); Backbone: DINOv2 (ViT-L/14) | 0.043 {1} (arXiv) | Pretraining: BlendedMVS & DIML & HR-WSI & IRS & MegaDepth & TartanAir; Training: BDD100K & Google Landmarks & ImageNet-21K & LSUN & Objects365 & Open Images V7 & Places365 & SA-1B | GitHub Stars | - | - |
| 3 | MiDaS v3.1 BEiTL-512 (TPAMI; ENH: arXiv); Backbone: BEiT512-L (ViT-L/16) | 0.048 {1} (arXiv) | Pretraining: ReDWeb & HR-WSI & BlendedMVS & NYU-Depth V2 & KITTI; Training: ReDWeb & DIML & 3D Movies & MegaDepth & WSVD & TartanAir & HR-WSI & ApolloScape & BlendedMVS & IRS & NYU-Depth V2 & KITTI | GitHub Stars | - | PyTorch (GitHub Stars) |
| 4 | Marigold (CVPR); Backbone: text-to-image LDM (Stable Diffusion v2) | 0.055 {1} (arXiv) | Hypersim & Virtual KITTI | GitHub Stars | - | - |
| 5 | GenPercept (arXiv); Backbone: VAE & U-Net (Stable Diffusion v2.1) | 0.056 {1} (arXiv) | Hypersim & Virtual KITTI | GitHub Stars | - | - |
| 6 | NeWCRFs + LightedDepth (CVPR; ENH: CVPR) | 0.057 {2} (CVPR) | ENH: NYU-Depth V2 | GitHub Stars; ENH: GitHub Stars | - | - |
| 7 | UniDepth-V (CVPR); Backbone: DINOv2 (ViT-L/14) | 0.0578 {1} (arXiv) | A2D2 & Argoverse2 & BDD100K & Cityscapes & DrivingStereo & Mapillary PSD & ScanNet & Taskonomy & Waymo | GitHub Stars | - | - |
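
For readers unfamiliar with the AbsRel and SqRel columns used in the depth rankings above, the following is a minimal sketch of the commonly used definitions, assuming predictions and ground truth are given as flat sequences of depth values. The function names and sample values are illustrative assumptions. Evaluation protocols differ between papers (valid-pixel masks, depth caps, scale or shift alignment for relative-depth models, and some codebases divide the squared error by the squared ground truth), so this sketch does not claim to reproduce any listed score.

```python
from typing import List, Sequence, Tuple


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair predictions with ground truth, skipping pixels without valid depth (gt <= 0)."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return pairs


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Absolute relative error: mean of |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def sq_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Squared relative error: mean of (pred - gt)^2 / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum((p - g) ** 2 / g for p, g in pairs) / len(pairs)


# Made-up depths in metres; the last pixel has no ground truth and is ignored.
pred = [1.1, 2.0, 3.3, 5.0]
gt = [1.0, 2.0, 3.0, 0.0]
print(f"AbsRel = {abs_rel(pred, gt):.4f}")
print(f"SqRel  = {sq_rel(pred, gt):.4f}")
```

Both metrics are lower-is-better, which is why the ranking titles use upper bounds such as `AbsRel<=0.058`.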


Hybrid with 7×7 synthetic light field views✖️: PSNR😞>=32dB

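| RK | Model | PSNR ↑ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|----|-------|--------------------|------------------|---------------------|-----------------|-------------|
| 1 | LFVRT (ECCV); MDE: DPT (ICCV); Backbone: ViT | 32.66 {3+1D} (ECCV) | GoPro & TAMULF | GitHub Stars; MDE: GitHub Stars | - | - |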

📝 Note: The above ranking includes only one model because the other methods are image-based and use no temporal information, which makes them unsuitable for light field video reconstruction from monocular video.
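
The PSNR threshold in this ranking (`>=32dB`) can be read against the standard definition, PSNR = 10 · log10(MAX² / MSE). The sketch below is an illustration under stated assumptions: pixel values are flat sequences normalised to [0, 1], and the function name `psnr` and the sample values are hypothetical. How scores are aggregated across light field views and video frames is not stated here, so the sketch covers a single image pair only.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], reconstructed: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for two equally sized pixel sequences."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up pixel values: a uniform error of 0.1 on a [0, 1] scale gives MSE = 0.01.
reference = [0.0, 0.0, 0.0, 0.0]
reconstructed = [0.1, 0.1, 0.1, 0.1]
print(f"PSNR = {psnr(reference, reconstructed):.2f} dB")
```

PSNR is higher-is-better, which is why this ranking uses a lower bound rather than an upper bound.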


Appendix 3: List of all research papers from the above rankings

| Method | Paper | Venue |
|--------|-------|-------|
| Depth Anything | Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | CVPR |
| DPT | Vision Transformers for Dense Prediction | ICCV |
| FutureDepth | FutureDepth: Learning to Predict the Future Improves Video Depth Estimation | arXiv |
| GBDMF | Multi-Resolution Monocular Depth Map Fusion by Self-Supervised Gradient-Based Composition | AAAI |
| GenPercept | Diffusion Models Trained with Large Data Are Transferable Visual Models | arXiv |
| LeReS | Learning to Recover 3D Scene Shape from a Single Image | CVPR |
| LightedDepth | LightedDepth: Video Depth Estimation in light of Limited Inference View Angles | CVPR |
| LFVRT | Synthesizing Light Field Video from Monocular Video | ECCV |
| Marigold | Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | CVPR |
| Metric3D | Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image | ICCV |
| Metric3D v2 | Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation | arXiv |
| MiDaS | Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer | TPAMI |
| MiDaS v3.1 | MiDaS v3.1 – A Model Zoo for Robust Monocular Relative Depth Estimation | arXiv |
| NeWCRFs | Neural Window Fully-connected CRFs for Monocular Depth Estimation | CVPR |
| PatchFusion | PatchFusion: An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation | CVPR |
| R + AL | High-Resolution Synthetic RGB-D Datasets for Monocular Depth Estimation | CVPRW |
| UniDepth | UniDepth: Universal Monocular Metric Depth Estimation | CVPR |
| ZoeDepth | ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth | arXiv |
