
Get access to 2d bounding boxes with respect to camera #3

Open
gianscarpe opened this issue May 11, 2022 · 3 comments

@gianscarpe

Hi,
thanks for your amazing work! I have a potentially silly question. I need to access the 2D bounding boxes for each camera (front, left, right, back), but I could only find the bounding boxes for the equirectangular image. I access the 2D boxes with:

token = "tr1thGb4-HK8yPOzSZFHQQ-cam-front"
boxes = met.get_sample_data(token, get_all_visible_boxes=False)[2]

The result is a list of EquiBox2d. Do they refer to the projection of the bounding boxes onto a specific camera (e.g., the frontal one)?
My second question regards the content of the points attribute. It's a numpy array of shape (80, 2). Are these all the points of the bounding box (and in that case, are the four corners the actual coordinates of the bounding box)?
Thanks in advance! :)

@ducksoup
Contributor

Hi @gianscarpe , you are already on the right track!

2D bounding boxes in Metropolis are annotated on the equirectangular images, which means that their edges map to curves, and not to straight lines, when seen from the perspective images. This is represented in the SDK by the EquiBox2d class. EquiBox2d is a discretized representation of one of these "deformed" boxes, where the points attribute contains its boundary expressed as a polygon. If you want to get a standard, axis-aligned box out of this, the easiest way would be to compute the bounding box of the points.

Note: this approach will give you bounding boxes that are not tight around the objects. Unfortunately, there's no way to obtain tight boxes on the perspective images given tight boxes on the equirectangular images without human re-annotation.
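For concreteness, here is a minimal sketch of the suggested conversion: computing an axis-aligned box from the polygon stored in an EquiBox2d's points attribute. The helper name is hypothetical, and it assumes points is an (N, 2) array of (x, y) pixel coordinates, as described above.

```python
import numpy as np

def equibox_to_aabb(points):
    """Compute an axis-aligned bounding box from an EquiBox2d polygon.

    points: (N, 2) array-like of (x, y) boundary coordinates, e.g. the
    `points` attribute of an EquiBox2d (shape (80, 2) per the discussion).
    Returns (x_min, y_min, x_max, y_max).
    """
    pts = np.asarray(points, dtype=float)
    x_min, y_min = pts.min(axis=0)  # smallest x and y over the boundary
    x_max, y_max = pts.max(axis=0)  # largest x and y over the boundary
    return x_min, y_min, x_max, y_max
```

As noted, the resulting box is the bounding box of the deformed polygon, so it will generally be looser than a box annotated directly on the perspective image.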

@gianscarpe
Author

Hi @ducksoup, thanks for your answer, it solved my problem! I have a couple more questions. I noticed that the SDK provides the 2D bounding boxes only for a handful of classes, while the large majority of the categories are missing (e.g., buildings, vegetation, sidewalks). Is this intended? Another question regards the depth. I noticed that some objects (e.g., cars and pedestrians) are missing from the lidar data, I guess because of SfM estimation. Is that correct? I attach a couple of examples, thank you for your time! :)
[two example screenshots attached]

@ducksoup
Contributor

ducksoup commented May 12, 2022

To answer your questions:

  • Bounding boxes are provided only for "things", i.e. countable objects. These are the categories that were annotated by human annotators. Everything else ("stuff" classes, panoptic segmentations) is machine generated and does not include bounding boxes.
  • The depth images are generated using SfM, so they only capture the static, consistently 3d-reconstructable part of the scene. Note that annotated moving objects such as cars and pedestrians are explicitly excluded from the reconstruction process.
