Rectangular four vertex law #11

Open
HEUzhouhanwen opened this issue Jan 8, 2018 · 9 comments


HEUzhouhanwen commented Jan 8, 2018

Hi!
I have thought about this issue for a long time but could not work it out.
Is there a fixed rule for the four vertices of the rectangle, such as clockwise or counter-clockwise ordering, or which vertex comes first? Won't these affect the results?
This function:
import tensorflow as tf

def bboxes_to_grasps(bboxes):
    # bboxes: [batch, 8] = four (x, y) vertices per rectangle
    box = tf.unstack(bboxes, axis=1)
    x = (box[0] + (box[4] - box[0])/2) * 0.35
    y = (box[1] + (box[5] - box[1])/2) * 0.47
    tan = (box[3] - box[1]) / (box[2] - box[0]) * 0.47/0.35
    h = tf.sqrt(tf.pow((box[2] - box[0])*0.35, 2) + tf.pow((box[3] - box[1])*0.47, 2))
    w = tf.sqrt(tf.pow((box[6] - box[0])*0.35, 2) + tf.pow((box[7] - box[1])*0.47, 2))
    return x, y, tan, h, w
Thank you!
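For illustration, here is a minimal plain-Python re-implementation of the same formulas (the function name and test rectangles are made up for this sketch, not taken from the repo). Listing the same tilted rectangle starting from a different vertex changes tan and swaps h and w, so the vertex order does matter:

import math

def bboxes_to_grasps_list(box):
    # Same arithmetic as the TensorFlow function above, applied to a
    # plain Python list [x1, y1, x2, y2, x3, y3, x4, y4].
    x = (box[0] + (box[4] - box[0]) / 2) * 0.35
    y = (box[1] + (box[5] - box[1]) / 2) * 0.47
    tan = (box[3] - box[1]) / (box[2] - box[0]) * 0.47 / 0.35
    h = math.hypot((box[2] - box[0]) * 0.35, (box[3] - box[1]) * 0.47)
    w = math.hypot((box[6] - box[0]) * 0.35, (box[7] - box[1]) * 0.47)
    return x, y, tan, h, w

# The same tilted rectangle, listed starting at two different vertices:
rect_a = [30, 20, 110, 60, 90, 100, 10, 60]
rect_b = [110, 60, 90, 100, 10, 60, 30, 20]  # listing shifted by one vertex

print(bboxes_to_grasps_list(rect_a))  # tan ≈ 0.67, h ≈ 33.7, w ≈ 20.1
print(bboxes_to_grasps_list(rect_b))  # tan ≈ -2.69, h and w swapped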

@tnikolla
Owner

Hi! Sorry, I did not understand the question.

@xiaoshuguo750

Do the four vertices of the positive rectangle follow a rule, such as clockwise or counter-clockwise ordering? I think this will affect the results of the function!
The function:
def bboxes_to_grasps(bboxes):
    box = tf.unstack(bboxes, axis=1)
    x = (box[0] + (box[4] - box[0])/2) * 0.35
    y = (box[1] + (box[5] - box[1])/2) * 0.47
    tan = (box[3] - box[1]) / (box[2] - box[0]) * 0.47/0.35
    h = tf.sqrt(tf.pow((box[2] - box[0])*0.35, 2) + tf.pow((box[3] - box[1])*0.47, 2))
    w = tf.sqrt(tf.pow((box[6] - box[0])*0.35, 2) + tf.pow((box[7] - box[1])*0.47, 2))
    return x, y, tan, h, w

@xiaoshuguo750

[image attachment: two sets of grasp equations]


ahundt commented Jan 18, 2018

I believe the answer to your question is yes: your choice of grasp encoding will affect the results. Object detection papers are a good source for this kind of information, and different object detection algorithms use different box encodings.

[image: box-encoding comparison figure from the paper linked below]

Here is object detection code with different bounding boxes:
https://github.com/tensorflow/models/tree/master/research/object_detection

Here is the paper associated with the link and image above, with more details:
https://arxiv.org/abs/1611.10012

One difference for grasp encodings is that they have an extra rotation parameter.
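For concreteness, here is a hypothetical sketch of such an encoding: the usual center/size parameters plus a rotation angle, from which the four rectangle corners can be recovered (names and conventions are mine, not from any particular repo):

import math

def grasp_to_corners(x, y, theta, h, w):
    # (x, y) = rectangle center, theta = gripper angle,
    # h = extent along the gripper axis, w = extent across it.
    ax, ay = math.cos(theta) * h / 2, math.sin(theta) * h / 2
    bx, by = -math.sin(theta) * w / 2, math.cos(theta) * w / 2
    # Corners = center +/- the two half-extent vectors,
    # in counter-clockwise order (for y-up axes).
    return [(x - ax - bx, y - ay - by),
            (x + ax - bx, y + ay - by),
            (x + ax + bx, y + ay + by),
            (x - ax + bx, y - ay + by)]

print(grasp_to_corners(21.0, 28.2, math.atan(0.5), 33.7, 20.1))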


ahundt commented Jan 24, 2018

I read their question again, and I think they're asking whether theta is clockwise or counter-clockwise. As per the actual dataset readme:

3. Grasping rectangle files contain 4 lines for each rectangle. Each line contains the x and y coordinate of a vertex of that rectangle separated by a space. The first two coordinates of a rectangle define the line representing the orientation of the gripper plate. Vertices are listed in counter-clockwise order.

@tnikolla I'm fairly certain there are a couple of problems in the code leading to worse performance than expected, because it only reads the first positive bounding box and ignores all the others.
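Based only on that readme description, a minimal sketch of a loader (the filename below is just an example from the dataset, and this is my reading of the format, not the repo's code):

def load_rectangles(path):
    # Each non-empty line is "x y"; every 4 consecutive vertices form one
    # rectangle, listed counter-clockwise, with the first two vertices
    # defining the line of the gripper plate.
    with open(path) as f:
        coords = [tuple(float(v) for v in line.split()) for line in f if line.strip()]
    return [coords[i:i + 4] for i in range(0, len(coords), 4)]

# rects = load_rectangles('pcd0100cpos.txt')
# plate_edge = (rects[0][0], rects[0][1])  # orientation of the gripper plate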


ahundt commented Feb 24, 2018

@tnikolla can you explain the constants 0.35 and 0.47?

They appear all over the place, such as in bboxes_to_grasps, grasp_to_bbox, and in the IoU calculation.

@jmichaux

@xiaoshuguo750 @ahundt Have either of you determined the proper encoding of the grasps? Also, there are no differences between the two sets of equations in @xiaoshuguo750's picture.


Juzhan commented Dec 13, 2018

I guess 0.35 and 0.47 are the scale factors for the box width and height. The images in the Cornell dataset are 640×480, but the network's input size is 224×224. After the image is resized, the bounding boxes also need to be rescaled, so the scale factors are 224/640.0 = 0.35 and 224/480 ≈ 0.47.
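A small sketch of that rescaling (constants computed rather than hard-coded; the helper name is mine):

SCALE_X = 224 / 640.0   # = 0.35
SCALE_Y = 224 / 480.0   # ≈ 0.4667, rounded to 0.47 in the repo

def rescale_vertices(vertices):
    # Map (x, y) vertices from the 640x480 image into the 224x224 input.
    return [(x * SCALE_X, y * SCALE_Y) for (x, y) in vertices]

print(rescale_vertices([(640, 480), (320, 240)]))  # [(224.0, 224.0), (112.0, 112.0)]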


ahundt commented Jan 14, 2019

Unfortunately, this repository suffers from averaging all the grasp labels. For example, if there is a frisbee it will try to grab the center rather than the lid edge.

I've got improved code at https://github.com/jhu-lcsr/costar_plan which works well for classification on the Cornell dataset, but a new Cornell training loop that gives credit for the smallest-error grasp would be needed for regression to work well there.

Links to other recent papers are at https://github.com/ahundt/awesome-robotics/blob/master/papers.md.
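As a rough sketch of what "credit on the smallest error grasp" could mean (my interpretation, not code from either repository): take the regression loss against the closest ground-truth grasp instead of the average over all labels.

import tensorflow as tf

def min_grasp_loss(pred, gt_grasps):
    # pred: [batch, 5] predicted (x, y, tan, h, w);
    # gt_grasps: [batch, num_labels, 5] all labeled grasps per image.
    diff = tf.expand_dims(pred, axis=1) - gt_grasps      # [batch, num_labels, 5]
    per_label = tf.reduce_sum(tf.square(diff), axis=2)   # squared error per label
    # Only the closest label contributes, so any one good grasp gets credit.
    return tf.reduce_mean(tf.reduce_min(per_label, axis=1))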
