Questions about Inference results #27

Open
Poodlee opened this issue Mar 22, 2024 · 3 comments

Comments

@Poodlee

Poodlee commented Mar 22, 2024

Hello. I want to do 3D object detection of a cube whose w, l, and h are all 0.04 m using an RGB-D camera, so I am using this package. This is my first time doing this kind of work, and my English is still not very good, so please bear with me.

First, I will describe how I created the dataset and the steps I followed, and then ask my questions at the end.

1. Understanding how SUN RGB-D data is processed and creating data in the same format

Label
The tools/data_converter/sunrgbd_data_utils.py file contains the following code.

class SUNRGBDInstance(object):

    def __init__(self, line):
        data = line.split(' ')
        data[1:] = [float(x) for x in data[1:]]
        self.classname = data[0]
        self.xmin = data[1]
        self.ymin = data[2]
        self.xmax = data[1] + data[3]
        self.ymax = data[2] + data[4]
        self.box2d = np.array([self.xmin, self.ymin, self.xmax, self.ymax])
        self.centroid = np.array([data[5], data[6], data[7]])
        self.width = data[8]
        self.length = data[9]
        self.height = data[10]
        # data[9] is x_size (length), data[8] is y_size (width), data[10] is
        # z_size (height) in our depth coordinate system,
        # l corresponds to the size along the x axis
        self.size = np.array([data[9], data[8], data[10]]) * 2
        self.orientation = np.zeros((3, ))
        self.orientation[0] = data[11]
        self.orientation[1] = data[12]
        self.heading_angle = np.arctan2(self.orientation[1],
                                        self.orientation[0])
        self.box3d = np.concatenate(
            [self.centroid, self.size, self.heading_angle[None]])

As the code above shows, each label line consists of 13 fields, so I created the label .txt files accordingly. Since I planned to use only the point cloud, I entered arbitrary values (1 1 2 2) for the 2D bbox.
box 1 1 2 2 -0.024179 0.896166 0.111629 0.04 0.04 0.04 1.000000 0.000000
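As a quick sanity check, the sketch below re-applies the field layout of the quoted `SUNRGBDInstance` code to the label line above. The function name `parse_label_line` is hypothetical and only mirrors the quoted indexing; it is not part of the package.

```python
import numpy as np


def parse_label_line(line):
    """Mirror the indexing of the quoted SUNRGBDInstance.__init__ (sketch only)."""
    data = line.split(' ')
    classname = data[0]
    values = [float(x) for x in data[1:]]  # values[i] corresponds to data[i + 1]
    centroid = np.array(values[4:7])       # data[5], data[6], data[7]
    # data[9] (x size), data[8] (y size), data[10] (z size), doubled as in the quoted code
    size = np.array([values[8], values[7], values[9]]) * 2
    # heading = arctan2(data[12], data[11])
    heading = np.arctan2(values[11], values[10])
    return classname, centroid, size, heading


line = 'box 1 1 2 2 -0.024179 0.896166 0.111629 0.04 0.04 0.04 1.000000 0.000000'
classname, centroid, size, heading = parse_label_line(line)
print(classname, centroid, size, heading)
```

Because the quoted code multiplies the three size fields by 2, the entries 0.04 0.04 0.04 produce a box size of 0.08 on each axis, and the orientation pair 1.0 0.0 produces a heading of 0. If the intended cube edge is 0.04 m, this suggests the size fields may be meant as half-extents; that reading is inferred from the quoted code only and is not confirmed in this thread.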

Depth
For the depth file, each point stores x, y, z, r, g, and b in that order, with r, g, and b scaled to values between 0 and 1.
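A minimal sketch of assembling points in that layout is shown below. It assumes the colors start as 8-bit values (0 to 255) and that float32 is an acceptable dtype; the helper name `make_depth_points` is hypothetical, and the on-disk format the converter expects is not specified in this thread.

```python
import numpy as np


def make_depth_points(xyz, rgb_uint8):
    """Stack N x 3 coordinates and N x 3 8-bit colors into an N x 6 array.

    Columns are x, y, z, r, g, b, with colors rescaled to [0, 1].
    """
    xyz = np.asarray(xyz, dtype=np.float32)
    rgb = np.asarray(rgb_uint8, dtype=np.float32) / 255.0
    if xyz.ndim != 2 or xyz.shape[1] != 3 or xyz.shape != rgb.shape:
        raise ValueError('xyz and rgb must both have shape (N, 3)')
    return np.hstack([xyz, rgb])


points = make_depth_points(
    [[-0.024, 0.896, 0.112], [0.0, 0.9, 0.1]],
    [[255, 0, 0], [0, 128, 255]],
)
print(points.shape, points[:, 3:].min(), points[:, 3:].max())
```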

2. Train

I ran training with the command below.

python tools/train.py configs/tr3d/tr3d_sunrgbd-3d-10class.py

3. Inference

I ran inference with the command below.

python demo/pcd_demo.py data/sunrgbd/points/000960.bin configs/tr3d/tr3d_sunrgbd-3d-10class.py work_dirs/surgbd-data/latest.pth --score-thr 0.6 --show

4. Problem in inference

Difference between label z value and inferred z value
[Image: show_result]
In the label file and in reality, the z value is 0.11, but the inferred value is 0.0759.
[Image: infer_bottom]
Also, when I run inference with the cube resting on the floor, I even get a negative z value.

So I visualized the point cloud used in the dataset to check it. As the image below shows, the cube's center is at approximately z = 0.11.
[Image: pc vis]

My question is this: the x and y coordinates are fairly accurate, but I don't know why there is an error in the z value.

Yaw value
[Image: image (9)]

I trained the model and then ran inference on the same data, but got yaw values I cannot make sense of. The results are below: the mid and max prefixes on the left refer to the distance, and the value on the right is the predicted yaw in radians (degrees).

max_clock_15 -> 0.08926178 (5.11˚)
max_clock_30 -> -1.3506540 (-77.39˚)
max_clock_45 -> 1.5099705 (86.51˚)
max_clock_60 -> 1.34937334 (77.31˚)
max_clock_75 -> 1.3324995 (76.35˚)
max_straight -> 0.04222381 (2.42˚)
mid_clock_15 -> 1.5214607 (87.17˚)
mid_clock_30 -> 1.4977183 (85.81˚)
mid_clock_45 -> 1.5112293 (86.59˚)
mid_clock_60 -> 1.4744493 (84.48˚)
mid_clock_75 -> 1.5131842 (86.70˚)
mid_straight -> 1.4999024 (85.89˚)

There seems to be a problem with my labeling. Could you explain in more detail how the angle values should be specified in the labels?

If anything is missing, please let me know. Thank you.

@filaPro
Contributor

filaPro commented Mar 22, 2024

Does this comment help you? For the yaw angle, it says that yaw=0 when the box is oriented along the x axis. Also, the center of the box is not its actual center but the center of its bottom face.
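To illustrate the bottom-face convention, here is a minimal sketch that converts a geometric-center z into a bottom-face-center z. The helper name `bottom_face_center_z` is hypothetical, and it assumes the labeled z (0.111629) is the cube's geometric center and that the box is upright.

```python
def bottom_face_center_z(geometric_center_z, box_height):
    """Shift the z coordinate from the box's geometric center to its bottom face."""
    return geometric_center_z - box_height / 2.0


label_center_z = 0.111629
# 0.04 is the cube edge from the question; 0.08 is the height that results if the
# quoted parser's "* 2" is applied to a size field of 0.04.
for height in (0.04, 0.08):
    print(height, round(bottom_face_center_z(label_center_z, height), 6))
```

With a height of 0.04 the bottom-face z comes out at about 0.0916, and with 0.08 at about 0.0716. The second value is closer to the reported 0.0759, which may indicate that the doubled size is what the model saw, though that is not confirmed here.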

@Poodlee
Author

Poodlee commented Mar 22, 2024

Thank you for your kind and quick response.

  1. z-value
    Since the predicted z value consistently differed from the actual z by about 2 - 2.5 cm, I was wondering whether I should readjust the origin, but your explanation about the bottom face resolved that. Thank you!

  2. yaw
    I had seen the comment you mentioned before, but I'm still not sure about the results. Following that comment, I treated the right side as the +x axis, rotated the cube clockwise in 15-degree steps, and set the label values to match before training. After training, the results are as follows, as mentioned above.

case (mid/max: distance) predicted yaw, rad (deg)     label yaw
max_clock_15             0.08926178 (5.11˚)           15˚
max_clock_30            -1.3506540 (-77.39˚)          30˚
max_clock_45             1.5099705 (86.51˚)           45˚
max_clock_60             1.34937334 (77.31˚)          60˚
max_clock_75             1.3324995 (76.35˚)           75˚
max_straight             0.04222381 (2.42˚)           0˚
mid_clock_15             1.5214607 (87.17˚)           15˚
mid_clock_30             1.4977183 (85.81˚)           30˚
mid_clock_45             1.5112293 (86.59˚)           45˚
mid_clock_60             1.4744493 (84.48˚)           60˚
mid_clock_75             1.5131842 (86.70˚)           75˚
mid_straight             1.4999024 (85.89˚)           0˚

(The cube shown below was rotated clockwise around the z-axis for the training data.)
[Image: IMG_5761]

Since it is a cube, is this because a 60-degree clockwise rotation and a 30-degree counterclockwise rotation are indistinguishable? Or was there a problem with my labeling?

@filaPro
Contributor

filaPro commented Mar 23, 2024

The angle prediction may not be very accurate. Also, we do not distinguish much between orientations that differ by a 90-degree rotation. You can find more details on the angle question in the FCAF3D paper.
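To make the 90-degree ambiguity concrete, here is a minimal sketch that compares two yaw angles while treating rotations by multiples of 90 degrees as the same orientation, which is the symmetry a cube has around its vertical axis. The helper name `yaw_diff_mod_90` is hypothetical and is not part of this repository or its evaluation code.

```python
import math


def yaw_diff_mod_90(yaw_a, yaw_b):
    """Smallest absolute difference between two yaws in radians,
    treating orientations that differ by a multiple of 90 degrees as equal."""
    period = math.pi / 2
    diff = (yaw_a - yaw_b) % period  # falls in [0, period)
    return min(diff, period - diff)


# A 60-degree clockwise rotation versus a 30-degree counterclockwise rotation,
# assuming counterclockwise is the positive direction.
print(math.degrees(yaw_diff_mod_90(math.radians(-60), math.radians(30))))
```

Under this measure the two rotations asked about earlier in the thread differ by essentially zero, so for a cube they describe the same box footprint.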
