If I understand it correctly, both SAM and SAM-HQ take boxes, input points, and labels (text) as prompts along with the input image.
What about input images for which we don't have this information available?
If we want to take the human completely out of the loop and have the model take only the image and predict the masks, what changes do we need to make to the model?
If I'm not mistaken, you still need the prompt encoder to produce embeddings before the model can predict a mask. The name "automatic mask generator" is actually a bit misleading: it just generates a grid of point prompts (roughly one every 20 pixels). Each point prompt is encoded, the model predicts candidate masks for it, and the candidates are ranked by predicted IoU; the most probable mask (or the top 3) is kept.
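To make the grid-of-point-prompts idea concrete, here is a minimal sketch of how such a uniform point grid can be built. This is an illustrative stand-alone function, not the actual `segment_anything` code: the real `SamAutomaticMaskGenerator` exposes a `points_per_side` parameter for this grid and additionally filters candidate masks by predicted IoU and stability score.

```python
import numpy as np

def build_point_grid(points_per_side: int) -> np.ndarray:
    """Return an (N, 2) array of normalized (x, y) point prompts laid out
    on a uniform grid over the image, similar in spirit to what SAM's
    automatic mask generator feeds to the prompt encoder.

    Sketch only: the real generator also scores each candidate mask by
    predicted IoU, filters by stability, and deduplicates with NMS.
    """
    # Offset so points sit at cell centers rather than on the image border.
    offset = 1.0 / (2 * points_per_side)
    coords = np.linspace(offset, 1.0 - offset, points_per_side)
    xs, ys = np.meshgrid(coords, coords)
    return np.stack([xs.ravel(), ys.ravel()], axis=-1)

# With 32 points per side (a common default), we get 32 x 32 = 1024 prompts.
grid = build_point_grid(32)
print(grid.shape)  # (1024, 2)
```

Each of these 1024 points is then treated as a foreground click: the prompt encoder embeds it, the mask decoder predicts masks for it, and only the highest-IoU candidates survive, which is how the human is taken out of the loop.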