Improve search widget for image data #11

Open
phurwicz opened this issue Dec 26, 2020 · 7 comments
Labels
enhancement New feature or request

Comments

@phurwicz
Owner

phurwicz commented Dec 26, 2020

Feb 24 2022: currently using vector search.


Image search is rather unclear compared with text. Things to consider:

  • which Bokeh widget to use: this could be FileInput for uploads or TextInput for URLs
  • what kind of search is appropriate (that is, one that makes semantic sense and stays as independent as possible from the vectorizer and dimensionality reduction)
@phurwicz added the "help wanted" and "enhancement" labels Dec 26, 2020
@phurwicz
Owner Author

More on image matching

@phurwicz
Owner Author

Work in progress: first implement a vector-based search. Then try to upgrade to structural similarity.
If structural similarity works well, we can do the same for audio in its MFCC format.
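
A minimal sketch of what the vector-based first step could look like, assuming every image already has an embedding from some vectorizer. The function names and the toy vectors are hypothetical and not hover's implementation; it only shows ranking by cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def cosine_top_k(query, vectors, k=3):
    """Return the indices of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy embeddings standing in for vectorized images (assumption).
image_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
nearest = cosine_top_k([1.0, 0.05, 0.0], image_vectors, k=2)
```

Because the ranking depends only on the embeddings, the result is only as meaningful as the vectorizer behind them, which is part of why structural similarity is considered as a follow-up.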

@phurwicz removed the "help wanted" label Feb 21, 2022
@phurwicz changed the title from "Add search widget for image data" to "Improve search widget for image data" Feb 25, 2022
@haochuanwei
Collaborator

Update: consider an abstract vector search engine, or an abstract similarity-based search engine.
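
One way to read "abstract similarity-based search engine" is an interface where only the similarity function changes between engines. The sketch below is an assumption about the shape such an abstraction might take; the class names `SimilaritySearch` and `NegativeDistanceSearch` are hypothetical and do not exist in hover.

```python
from abc import ABC, abstractmethod


class SimilaritySearch(ABC):
    """Hypothetical interface: rank stored items by similarity to a query."""

    def __init__(self, items):
        self.items = list(items)

    @abstractmethod
    def similarity(self, query, item):
        """Return a score where higher means more similar."""

    def search(self, query, top_k=5):
        """Return indices of the top_k most similar items, best first."""
        ranked = sorted(
            range(len(self.items)),
            key=lambda i: self.similarity(query, self.items[i]),
            reverse=True,
        )
        return ranked[:top_k]


class NegativeDistanceSearch(SimilaritySearch):
    """Vector flavor: similarity is the negative Euclidean distance."""

    def similarity(self, query, item):
        distance = sum((q - x) ** 2 for q, x in zip(query, item)) ** 0.5
        return -distance


engine = NegativeDistanceSearch([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
hits = engine.search([0.9, 1.2], top_k=2)
```

Under this design, a structural-similarity engine for images (or for audio in MFCC form) would be another subclass that overrides only `similarity`.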

@FlorianBertonBrightClue

Hi, I'm using hover to annotate a dataset of images, and I ran into a small issue with how images are displayed.

If an image's resolution is higher than the row width or row height defined for the table, it does not display well.

To fix this, you can change line 60 of hover/core/local_config.py from:
template="""<img src=<%= value %>>""",
to:
template="""<img src=<%= value %> width="200" height="200">""",

This resizes every image to 200x200, so an image with a resolution of, for instance, 720x720 becomes fully visible.

I can open a PR for this if you want, but the modification is very small.

@haochuanwei
Collaborator

Hi, @FlorianBertonBrightClue thank you for using hover and bringing this up!

In some use cases 200x200 can be difficult to see clearly. Actually we can make it configurable on the user side.

The code below should work for the upcoming 0.8.0 version, which is likely to be released within a week.

hover.config["visual"]["table_image_width"] = 200
hover.config["visual"]["table_image_height"] = 200

Does this look good?

@FlorianBertonBrightClue

So at line 57/58 in local_config.py, you would put this?
feature_col_kwargs["formatter"] = HTMLTemplateFormatter(
    template=f'<img src=<%= value %> width="{hover.config["visual"]["table_image_width"]}" height="{hover.config["visual"]["table_image_height"]}">',
)

If yes, it should work, and users could configure it as they want by setting hover.config.

I also have two questions for you:

  • It seems that for now we can't set the parameters of the DimensionalityReducer. Would that be possible later? For instance, in umap you can choose the number of neighbors or the minimum distance.

  • Can we change the label of data that are already in train? In my case I did some prelabeling, and sometimes I want to change a label because two clusters are close to each other and I want to merge them.

@haochuanwei
Collaborator

haochuanwei commented Jan 20, 2023

So at line 57/58 in local_config.py, you would put this?
feature_col_kwargs["formatter"] = HTMLTemplateFormatter(
    template=f'<img src=<%= value %> width="{hover.config["visual"]["table_image_width"]}" height="{hover.config["visual"]["table_image_height"]}">',
)

Basically yes. This line reads the config only once though, so be sure to configure immediately after import hover.
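
The read-once behavior can be illustrated with a small sketch. It assumes a plain dict standing in for hover.config and a hypothetical helper `build_image_template`; neither is hover's actual code. The point is only that an f-string is evaluated when it is built, so later config changes do not reach a template that already exists.

```python
# Stand-in for hover.config; the real object is not used here (assumption).
config = {"visual": {"table_image_width": 200, "table_image_height": 200}}


def build_image_template(cfg):
    """Build the Underscore-style template string for the image column."""
    width = cfg["visual"]["table_image_width"]
    height = cfg["visual"]["table_image_height"]
    return f'<img src=<%= value %> width="{width}" height="{height}">'


# The template captures the config values at this moment.
template = build_image_template(config)

# Changing the config afterwards does not alter the string already built.
config["visual"]["table_image_width"] = 400
fresh = build_image_template(config)
```

Here `template` still carries width 200 while `fresh` carries 400, which is why the configuration needs to be set before the formatter is created.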

Customize DimensionalityReducer

  • It seems that for now we can't set the parameters of the DimensionalityReducer. Would that be possible later? For instance, in umap you can choose the number of neighbors or the minimum distance.

Technically you can. With dataset.compute_nd_embedding() you can pass in keyword arguments that umap accepts. Hover attempts to “translate” crucial kwargs for compatibility (like “dimension” to different equivalents in umap and ivis) but will forward the rest.

This could be much better documented though.
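
Until that documentation exists, the "translate, then forward" idea can be sketched as follows. The table and the `translate_kwargs` helper are illustrative stand-ins rather than hover's source; `n_components`, `n_neighbors` and `min_dist` are umap's own parameter names, and `embedding_dims` is ivis's.

```python
# Illustrative translation table (assumption, not copied from hover):
# the shared "dimension" keyword maps to each library's own name.
KWARG_TRANSLATION = {
    "umap": {"dimension": "n_components"},
    "ivis": {"dimension": "embedding_dims"},
}


def translate_kwargs(method, **kwargs):
    """Rename known keys for the chosen method and pass the rest through."""
    mapping = KWARG_TRANSLATION.get(method, {})
    return {mapping.get(key, key): value for key, value in kwargs.items()}


# umap-specific arguments are forwarded untouched; only "dimension" is renamed.
umap_kwargs = translate_kwargs("umap", dimension=2, n_neighbors=15, min_dist=0.1)
ivis_kwargs = translate_kwargs("ivis", dimension=2)
```

In practice this would mean passing arguments such as `n_neighbors` or `min_dist` alongside the usual ones to dataset.compute_nd_embedding(), as described above.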

Edit committed labels

  • Can we change the label of data that are already in train? In my case I did some prelabeling, and sometimes I want to change a label because two clusters are close to each other and I want to merge them.

You can do this a few ways depending on which one is convenient:

  • in the selection table (the one where large images don’t show well right now), make edits in the label column and save the edits.
  • access the underlying dataframe with dataset.dfs["train"].
  • export to file, edit and load back.

You cannot change train labels directly the same way you label raw data in the scatter plot. “Commit” locks in the subset and label unless you take the “backdoors” above. This is to prevent accidentally relabeling already-labeled data that happen to be (often for good reasons) mixed into a selection.
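
As a rough illustration of the "export to file, edit and load back" route for merging two clusters, the sketch below relabels rows in CSV content using only the standard library. The column names, file content and label names are made up for the example, and hover's own export format may differ.

```python
import csv
import io

# Hypothetical exported content; column names are assumptions.
exported = "image,label\na.png,cat\nb.png,kitten\nc.png,dog\n"

# Merge the "kitten" cluster into "cat"; other labels stay as they are.
MERGE = {"kitten": "cat"}

rows = list(csv.DictReader(io.StringIO(exported)))
for row in rows:
    row["label"] = MERGE.get(row["label"], row["label"])

# Write the edited rows back out, ready to be loaded again.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["image", "label"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
relabeled = buffer.getvalue()
```

When working on the dataframe from dataset.dfs["train"] instead, the same mapping could presumably be applied with pandas, for example `df["label"] = df["label"].replace(MERGE)`.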
