
Saving rendered 3d bounding box coordinates in the image space (pixel coordinates) from ZED SDK? #457

Open

harishkool opened this issue Jan 13, 2022 · 3 comments

@harishkool

Preliminary Checks

  • This issue is not a duplicate. Before opening a new issue, please search existing issues.
  • This issue is not a question, bug report, or anything other than a feature request directly related to this project.

Proposal

I am following the example shown in zed-examples/object detection/image viewer at master · stereolabs/zed-examples · GitHub, which uses OpenGL to draw the detected objects on the image. Right now the 3D bounding box coordinates from the ZED SDK are normalized; I think it would be great if the ZED SDK offered the option of returning the 3D bounding box coordinates in pixel space, taking the projection matrix and image shape as input. I am doing the following to get the 3D bounding box coordinates in image space, i.e. pixel coordinates:

    bbox = objects.object_list[i].bounding_box  # 8 x 3 corners of the 3D bounding box
    N = 8

    # Homogeneous corners (8 x 4) projected through the OpenGL projection matrix
    hom_obj_coords = np.c_[bbox, np.ones(N)]
    proj3D_cam = np.matmul(hom_obj_coords, _cam_mat)  # 8 x 4 clip-space coordinates

    # Perspective divide and viewport transform to the 1920 x 1080 image
    # (1172 appears to be the OpenGL window height, hence the extra vertical offset)
    proj2D_x = ((proj3D_cam[:, 0] / hom_obj_coords[:, 3]) * 1920) / (2. * proj3D_cam[:, 3]) + (1920 * 0.5)
    proj2D_y = ((proj3D_cam[:, 1] / hom_obj_coords[:, 3]) * 1172) / (2. * proj3D_cam[:, 3]) + ((1172 * 0.5) + ((1172 - 1080) * 0.5))

where _cam_mat is the projection matrix I got from the OpenGL code. However, the boxes do not align properly with the objects in the image, so I think it would be great if the ZED SDK provided support for this.
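
For reference, here is a minimal, self-contained sketch of the clip-space-to-pixel conversion the snippet above attempts, assuming the points are multiplied as row vectors by a transposed OpenGL projection matrix and the viewport matches the image size (both of these are assumptions, since they depend on how _cam_mat and the window are set up):

    import numpy as np

    def opengl_project_to_pixels(points_3d, proj_mat, img_w, img_h):
        """Project N x 3 points to N x 2 pixel coordinates with an OpenGL-style matrix.

        proj_mat is assumed to be laid out so that row-vector multiplication
        (p @ proj_mat) yields clip-space coordinates, matching the snippet above.
        """
        pts = np.asarray(points_3d, dtype=np.float64)
        hom = np.c_[pts, np.ones(len(pts))]   # N x 4 homogeneous points
        clip = hom @ proj_mat                 # N x 4 clip-space coordinates
        ndc = clip[:, :3] / clip[:, 3:4]      # perspective divide -> [-1, 1]
        u = (ndc[:, 0] + 1.0) * 0.5 * img_w   # viewport transform
        v = (1.0 - ndc[:, 1]) * 0.5 * img_h   # flip y: OpenGL NDC has +y up, image rows grow down
        return np.stack([u, v], axis=1)

Note that OpenGL NDC has +y pointing up while image rows grow downwards, so a vertical flip is usually needed; whether that applies here depends on how _cam_mat was built.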

Use-Case

Saving the 3D bounding boxes in pixel space would make it possible to train a custom 3D object detection network without needing the associated point clouds.

Anything else?

No response

@obraun-sl
Member

Hi,

The best option is to use the OpenCV projectPoints function, as it is made for exactly that:
https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html#ga1019495a2c8d1743ed5cc23fa0daff8c

The cameraMatrix is given by CameraInformation().calibration_parameters, and R, T is the pose of the camera (if necessary).
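
For illustration, a minimal sketch of that suggestion; the pyzed accessor names in the usage comment are assumptions, since they vary between ZED SDK versions:

    import cv2
    import numpy as np

    def project_box_to_pixels(bbox_3d, fx, fy, cx, cy, dist=None, rvec=None, tvec=None):
        """Project 8 x 3 box corners (metres, camera frame) to 8 x 2 pixel coordinates."""
        K = np.array([[fx, 0.0, cx],
                      [0.0, fy, cy],
                      [0.0, 0.0, 1.0]], dtype=np.float64)
        dist = np.zeros(5) if dist is None else np.asarray(dist, dtype=np.float64)
        # Identity pose if the corners are already expressed in the left-camera frame;
        # otherwise pass the world-to-camera rotation/translation as rvec/tvec.
        rvec = np.zeros(3) if rvec is None else np.asarray(rvec, dtype=np.float64)
        tvec = np.zeros(3) if tvec is None else np.asarray(tvec, dtype=np.float64)
        pixels, _ = cv2.projectPoints(np.asarray(bbox_3d, dtype=np.float64), rvec, tvec, K, dist)
        return pixels.reshape(-1, 2)

    # Hypothetical usage with the object detection sample (accessor names depend on the SDK version):
    # calib = zed.get_camera_information().calibration_parameters.left_cam
    # uv = project_box_to_pixels(objects.object_list[i].bounding_box,
    #                            calib.fx, calib.fy, calib.cx, calib.cy, calib.disto)

This assumes the box corners follow the OpenCV camera convention (x right, y down, z forward); if the SDK is configured with a different coordinate system, the points need to be transformed into that convention first.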

@fennecinspace

fennecinspace commented Jan 27, 2023

@obraun-sl I have tried doing this, but the rotation of the resulting bounding boxes is weird.
@harishkool please share if you've found a solution to fix the wonky boxes.

@tavasolireza

@fennecinspace Did you eventually solve this?
