PanopticBEV: Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images

October 2021

tl;dr: Use of a vertical and a flat transformer to lift image into BEV.

Overall impression

The paper notes correctly that the notion of instance is critical to downstream. FIERY also extends the semantic segmentation idea to instance segmentation. Panoptic BEV goes one step further and does panoptic segmentation.

Key ideas

Backbone: resnet + BiFPN
At each level (P2-P5), image features are projected into a BEV by a dense transformer module (note that this transformer is not attention based).
Dense transformer module
- Each dense transformer consists of a distinct vertical and flat transformer.
- The vertical transformer uses a volumetric lattice to model the intermediate 3D space which is then flattened to generated the vertical BEV features. --> this vertical transformer is quite similar to that of Lift Splat Shoot.
- The flat transformer uses IPM followed by an Error Correction Module (ECM) to generate the flat BEV features.
- The differentiation between flat and vertical module is by a binary semantic segmentation network.

Technical details

Summary of technical details

Notes

Video
Code

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

panoptic_bev.md

panoptic_bev.md

PanopticBEV: Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images

Overall impression

Key ideas

Technical details

Notes

Files

panoptic_bev.md

Latest commit

History

panoptic_bev.md

File metadata and controls

PanopticBEV: Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images

Overall impression

Key ideas

Technical details

Notes