December 2019

tl;dr: Predict dense depth from a one-line lidar scan, guided by an RGB image.

## Overall impression

This paper proposes to "sculpt" the entire depth image from a reference, whereas the original depth prediction task creates depth values from the unknown. This makes the problem more tractable.

Monocular depth estimation is an ill-posed problem. Using sparse depth information can help resolve the scale ambiguity. Refer to Deep Depth Completion of a Single RGB-D Image, DeepLiDAR, and Sparse-to-Dense for depth completion from unstructured sparse data.

A similar idea has been used in Camera Radar Fusion Net.

## Key ideas

- For each point in the imputed laser scan, generate a line along the gravity direction in 3D, then project it back to 2D. Generating a vertical line directly in 2D should largely yield the same result.
- Add the reference depth map to the network output to predict depth, so the network only has to learn the residual depth (see the sketch after this list).
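
A minimal sketch of these two ideas, assuming lidar points in camera coordinates, a pinhole intrinsic matrix `K`, and a hypothetical network `net`; none of these names come from the paper, and the real implementation likely differs.

```python
import numpy as np

def build_reference_depth(scan_points, K, img_h, img_w,
                          y_min=-2.0, y_max=2.0, n_steps=64):
    """Extrude each 3D lidar point along the gravity (camera y) axis and
    project the resulting vertical line back into the image to obtain a
    dense reference depth map."""
    ref_depth = np.zeros((img_h, img_w), dtype=np.float32)
    ys = np.linspace(y_min, y_max, n_steps)
    for x, _, z in scan_points:           # points in camera coordinates
        for y_new in ys:                  # vertical line in 3D along gravity
            p = K @ np.array([x, y_new, z])
            u, v = int(p[0] / p[2]), int(p[1] / p[2])
            if 0 <= u < img_w and 0 <= v < img_h:
                ref_depth[v, u] = z       # keep the depth of the source point
    return ref_depth

# The network then only learns the residual on top of the reference:
# pred_depth = ref_depth + net(rgb_image, ref_depth)
```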

## Technical details

- Interpolation is used to fill in the blanks in the horizontal direction before populating the vertical one. This is potentially dangerous, as it introduces spurious data points in mid-air.
- Mixed classification and regression loss (see the sketch after this list):
  - Multibin classification: the predicted value is the weighted average of all bins.
  - Softmax loss: when the prediction falls into the correct bin, the classification loss vanishes. This can be extended to the cross entropy loss used in DC: $$L_c = -\sum_{i=1}^{M}\sum_{k=1}^{K} \delta([y_i] - k) \log p^k_i = -\sum_{i=1}^{M} \log p_i^{[y_i]}$$
  - Regression with L1 loss.
  - For improved regression, see SMWA or DC.
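
A minimal sketch of the mixed classification + regression loss; tensor names, bin layout, and loss weights are assumptions rather than the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def mixed_depth_loss(logits, gt_depth, bin_centers, w_cls=1.0, w_reg=1.0):
    """logits: (N, K) per-pixel scores over K depth bins,
    gt_depth: (N,) ground-truth depths, bin_centers: (K,) depth bin centers."""
    probs = F.softmax(logits, dim=1)

    # Multibin prediction: weighted average of bin centers, weighted by probability.
    pred_depth = (probs * bin_centers.unsqueeze(0)).sum(dim=1)

    # Classification branch: cross entropy against the nearest-bin index of gt.
    gt_bin = (gt_depth.unsqueeze(1) - bin_centers.unsqueeze(0)).abs().argmin(dim=1)
    cls_loss = F.cross_entropy(logits, gt_bin)

    # Regression branch: L1 loss between the soft prediction and the ground truth.
    reg_loss = F.l1_loss(pred_depth, gt_depth)

    return w_cls * cls_loss + w_reg * reg_loss
```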

## Notes

- This idea of using complementary sensor information can be extended to depth prediction with radar and RGB images.