YOLO ground truth width and length are not relative to image size but to S #140

Open
oonisim opened this issue Feb 26, 2023 · 0 comments

oonisim commented Feb 26, 2023

Code

dataset.py calculates width_cell and height_cell, which are then stored in the label_matrix tensor.

"""
...
Then to find the width relative to the cell is simply:
width_pixels/cell_pixels, simplification leads to the
formulas below.
"""
width_cell, height_cell = (
    width * self.S,
    height * self.S,
)

Question

Please help me understand why width_cell and height_cell are expressed in cells, that is, relative to S rather than to the image size.

In my understanding, width and height come from the YOLO Darknet annotation, where they are relative to the image size and take values between 0 and 1. Suppose width=0.7 and S=7; then width_cell will be 4.9 cells.
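To make the arithmetic concrete, here is a minimal sketch of the two unit conventions. The helper names to_cell_units and to_image_units are hypothetical and are not part of dataset.py; S=7 is assumed because it is the value that turns 0.7 into 4.9.

```python
def to_cell_units(width, height, S):
    """Convert image-relative width/height (0..1) into units of grid cells."""
    return width * S, height * S


def to_image_units(width_cell, height_cell, S):
    """Convert cell-relative width/height back into image-relative values (0..1)."""
    return width_cell / S, height_cell / S


# A box that spans 70% of the image width and 50% of its height, on a 7x7 grid.
width_cell, height_cell = to_cell_units(0.7, 0.5, S=7)
print(width_cell, height_cell)  # roughly 4.9 and 3.5 cells

# Dividing by S recovers the image-relative values from the annotation.
width, height = to_image_units(width_cell, height_cell, S=7)
print(width, height)
```

The two conventions differ only by the constant factor S, so either one can be recovered from the other as long as S is known.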

If width_cell and height_cell are used as the ground truth for YOLO v1 training, I suppose they should be relative to the image size, as in the YOLO v1 paper:

Each bounding box consists of 5 predictions: x, y, w, h,
and confidence. The (x, y) coordinates represent the center
of the box relative to the bounds of the grid cell. The width
and height are predicted relative to the whole image.
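As a rough illustration of the convention quoted from the paper, the sketch below encodes one Darknet-style box with the center relative to its grid cell and the width and height left relative to the whole image. The function encode_box and its wh_relative_to parameter are hypothetical names introduced here for comparison only; the "cell" branch mirrors the width * self.S scaling shown in the snippet from dataset.py.

```python
def encode_box(x, y, w, h, S, wh_relative_to="image"):
    """Encode one box given in image-relative coordinates (all values in 0..1).

    Returns (row, col, x_cell, y_cell, w_out, h_out), where (row, col) is the
    grid cell containing the box center and (x_cell, y_cell) is the center
    offset inside that cell.
    """
    # Grid cell that contains the center; clamp so x == 1.0 or y == 1.0 stays in range.
    col = min(int(S * x), S - 1)
    row = min(int(S * y), S - 1)

    # Center offset relative to the bounds of that cell.
    x_cell = S * x - col
    y_cell = S * y - row

    if wh_relative_to == "image":
        # Paper convention: width and height relative to the whole image.
        w_out, h_out = w, h
    elif wh_relative_to == "cell":
        # Scaling as in the dataset.py snippet: width and height in units of cells.
        w_out, h_out = w * S, h * S
    else:
        raise ValueError("wh_relative_to must be 'image' or 'cell'")

    return row, col, x_cell, y_cell, w_out, h_out


paper_style = encode_box(0.5, 0.5, 0.7, 0.3, S=7, wh_relative_to="image")
cell_style = encode_box(0.5, 0.5, 0.7, 0.3, S=7, wh_relative_to="cell")
print(paper_style)
print(cell_style)
```

Whichever convention the labels use, the loss has to compare them with predictions expressed in the same units.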
