
Question: Is the SAM-HQ model applicable for predicting segmentation masks for input images without boxes, points, or labels? #106

Open
mzg0108 opened this issue Dec 6, 2023 · 2 comments

Comments

@mzg0108

mzg0108 commented Dec 6, 2023

If I understand it correctly, both SAM and SAM-HQ take box prompts, input points, and point labels as input along with the input image.
What about input images for which we don't have this information available?

If we want to take the human completely out of the loop and have the model take only the image as input and predict the masks, what changes do we need to make to the model?

@lkeab
Collaborator

lkeab commented Dec 9, 2023

We can use the everything mode, as demonstrated here, which feeds uniformly sampled points on the image as prompts.
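
For reference, a minimal sketch of everything mode, assuming SAM-HQ's fork keeps the original `segment_anything` API (`sam_model_registry`, `SamAutomaticMaskGenerator`) and that a checkpoint file is available locally; the checkpoint path and grid density below are illustrative:

```python
# Everything mode: no manual prompts are given; the generator
# seeds a uniform grid of point prompts across the image itself.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Checkpoint filename is an assumption; use whichever SAM-HQ weights you downloaded.
sam = sam_model_registry["vit_h"](checkpoint="sam_hq_vit_h.pth")
sam.to("cuda")

# points_per_side controls the density of the uniform point grid.
mask_generator = SamAutomaticMaskGenerator(sam, points_per_side=32)

image = cv2.cvtColor(cv2.imread("input.jpg"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)  # list of dicts: 'segmentation', 'predicted_iou', ...
print(f"{len(masks)} masks generated")
```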

@jez-moxmo

If I'm not mistaken, you still need the prompt encoder to produce embeddings before an image can be masked. The name "automatic mask generator" is actually a bit misleading, as it just generates a point prompt roughly every 20 pixels. For each point prompt, embeddings are encoded, the candidate masks are scored by the model's predicted IoU, and the most probable mask (or the top 3) is kept.
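
To make that concrete, here is a hedged sketch of what the generator does under the hood, written against the standard `segment_anything` `SamPredictor` API; the grid spacing, foreground label, and IoU threshold are illustrative, not the library's exact defaults:

```python
# Under-the-hood sketch: seed a uniform grid of single-point prompts,
# predict candidate masks for each, and keep only the masks the model
# itself scores highly (predicted IoU).
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_hq_vit_h.pth")  # assumed checkpoint path
sam.to("cuda")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("input.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # image embedding is computed once and reused per prompt

h, w = image.shape[:2]
step = 20  # roughly "a point prompt every 20 pixels"
kept = []
for y in range(step // 2, h, step):
    for x in range(step // 2, w, step):
        masks, ious, _ = predictor.predict(
            point_coords=np.array([[x, y]]),
            point_labels=np.array([1]),  # 1 = foreground point
            multimask_output=True,       # three candidate masks per point
        )
        best = int(np.argmax(ious))
        if ious[best] > 0.88:            # illustrative confidence cutoff
            kept.append(masks[best])
```

In the real generator the surviving masks are additionally deduplicated with non-maximum suppression, which this sketch omits.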
