Make it easier to label to another category #469

Open
yannisk2 opened this issue Mar 30, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@yannisk2
Collaborator

Is your feature request related to a problem?
When a user labels to another category, they have to open the other categories dropdown and then move the mouse:

  • first vertically to the desired category and
  • then horizontally to the checkmark/X icon shown on the category's row to assign a positive/negative label, respectively.

This combined movement can make labeling to other categories a bit slow.

What is the expected behavior?
To make this process more efficient, we could allow the user to label to a category by clicking anywhere on the row where the category name appears, thus avoiding the horizontal mouse movement. The obvious issue with this solution is that we want to allow the user to assign both positive and negative labels. However, since we expect positive labels to be used significantly more often in the label to other category function than negative labels, we could give priority to the positive label.

There are a few different ways to implement this. I am adding here a couple of options that we have discussed in the past:

  • When the user clicks on a category, the system cycles through positive, negative, and no label (in this order). This allows clicking on a category to be used both for positive and negative labels (while giving priority to positive labels).
  • When the user clicks on a category, the system cycles between positive and no label. This essentially treats clicking on a category as a shortcut for positive labels only.
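The two options can be sketched as a small state machine. This is only an illustration: `Label`, `CYCLE_FULL`, `CYCLE_POSITIVE_ONLY`, and `next_label` are hypothetical names rather than existing code, and restarting the cycle from a label that is not part of it (for example, a negative label set via the X icon under the second option) is an assumption.

```python
from enum import Enum


class Label(Enum):
    NONE = "none"
    POSITIVE = "positive"
    NEGATIVE = "negative"


# Option 1: positive -> negative -> no label -> positive ...
CYCLE_FULL = [Label.POSITIVE, Label.NEGATIVE, Label.NONE]
# Option 2: positive -> no label -> positive ...
CYCLE_POSITIVE_ONLY = [Label.POSITIVE, Label.NONE]


def next_label(current, cycle):
    """Return the label that one click on the category row would assign."""
    if current not in cycle:
        # Assumption: a label outside the cycle restarts it at the first entry.
        return cycle[0]
    return cycle[(cycle.index(current) + 1) % len(cycle)]
```

In both options, a click on an unlabeled category yields a positive label first, which matches the priority given to positive labels above.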

Note that the above does not suggest removing the checkmark/X icons from the label to another category list. We can still keep the icons to provide a similar UX to the current labeling within a category, while providing additional shortcuts to make labeling to another category faster.

@yannisk2 yannisk2 added the enhancement New feature or request label Mar 30, 2023
@martinscooper
Member

@yannisk2 Good suggestion. I agree on adding this sort of shortcut. Of the two options for what repeated clicks on a category do, I think I prefer the first.

The only concern I have is that the labeling progress bar sometimes continues to increase when an element's label is changed. So if a user labels an element as negative for another category by clicking twice instead of clicking the cross button, it may be a problem: the status bar will increase for the first click even though the user never meant to label the element as positive.

As a third option that I just thought about: what about 1 click for positive label and 2 consecutive quick clicks for negative label?

@yannisk2
Collaborator Author

> As a third option that I just thought about: what about 1 click for positive label and 2 consecutive quick clicks for negative label?

Regarding the third option, as we briefly discussed offline, I am worried that double-clicking may not be obvious to users.

> I think I prefer the first item suggestion regarding what clicking a category several times does.
>
> The only concern I have is that sometimes the progress labeling bar continues to increase when changing an element's label. So, I think that if the user prefers to label an element for another category as negative by using the two-clicks instead of clicking the cross button it may be a problem, because the status bar will increase even if the user didn't actually do the first click to actually label it as positive.

That is a great point! I am wondering though whether this is something that should be improved in the logic that counts the number of modified labels, so that the number of modified labels does not depend so much on the exact sequence of labeling actions that a user goes through until they arrive at the final label for an element.

For instance, we could change the semantics of modified labels so that an element's label counts as modified only if it differs from the label the element had in the previous iteration. With this definition, if an element was unlabeled at iteration I and at iteration I+1 it is labeled first as positive and then as negative, this is counted as a single label change (instead of the two label changes counted by the current logic). I think this gives us two advantages: (a) it makes the changed-label logic more intuitive and (b) it gives us the freedom to choose an intuitive UX for the label-to-another-category shortcut discussed in this issue.
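To make the difference concrete, here is a minimal sketch comparing the two counting rules on the example above. The function names, the string label values, and the URI are made up for illustration, and the per-action counter is only my reading of the current behavior as described in this thread.

```python
def count_per_action(actions):
    """Per-action counting: every label action bumps the counter."""
    return len(actions)


def count_vs_previous_iteration(baseline, actions):
    """Proposed counting: an element counts once, and only if its final
    label differs from its label at the previous iteration.

    baseline: dict mapping element URI -> label at the previous iteration
              (a missing URI or None means unlabeled).
    actions:  ordered list of (uri, new_label) pairs.
    """
    final = dict(baseline)
    for uri, label in actions:
        final[uri] = label
    return sum(1 for uri, label in final.items() if label != baseline.get(uri))


# An element unlabeled at iteration I is labeled positive, then negative.
actions = [("doc1-3", "positive"), ("doc1-3", "negative")]
print(count_per_action(actions))                 # 2
print(count_vs_previous_iteration({}, actions))  # 1
```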

@martinscooper @shnarch @alonh @arielge What do you think about this? If you agree with the idea of changing the logic that counts the number of changed labels, I can create a separate issue to discuss this change in more detail.

@martinscooper martinscooper self-assigned this May 12, 2023
@alonh
Contributor

alonh commented May 16, 2023

@martinscooper @yannisk2 Today, all we need to store is a single integer counting the number of changes since the last model was trained. Do you have an idea for how to accomplish the desired behavior without drastically increasing the amount of data we save?

@martinscooper
Member

@alonh As I imagine it, the data to save would be a dict with a maximum size of changed_element_threshold, whose keys are the URIs of the elements and whose values are tuples holding the starting label (from the last iteration) and the changed one. If apply_labels_to_duplicate_texts is True, we would have to use the texts instead of the element URIs as keys. The dict would be updated each time a label is applied; if the tuple of an entry ends up with two equal values, the entry is removed (no change). The number of changed elements is the size of the dict. Do you consider this a drastic increase in the data saved?
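A rough sketch of that dict, just to show the size of the idea. `LabelChangeTracker` and its methods are hypothetical names, not existing code, and clearing the dict when a new model is trained is an assumption on my side.

```python
class LabelChangeTracker:
    """Tracks which elements differ from their label at the last iteration."""

    def __init__(self):
        # key -> (label at last iteration, current label).
        # The key is the element URI, or the element text when
        # apply_labels_to_duplicate_texts is True.
        self._changes = {}

    def apply_label(self, key, last_iteration_label, new_label):
        # Keep the starting label recorded the first time this key changed.
        start, _ = self._changes.get(key, (last_iteration_label, None))
        if start == new_label:
            # Back to the starting label: no net change, drop the entry.
            self._changes.pop(key, None)
        else:
            self._changes[key] = (start, new_label)

    @property
    def num_changed(self):
        return len(self._changes)

    def reset(self):
        # Assumption: called once a new model has been trained.
        self._changes.clear()
```

With this, labeling an unlabeled element as positive and then as negative leaves `num_changed` at 1, and removing the label again brings it back to 0.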

@martinscooper
Member

@alonh I don't yet understand how the mechanism actually works. For example, in the attached video, if only an integer is stored, why does the labeling progress indicator decrease?

Screen.Recording.2023-05-16.at.12.01.47.mov

@alonh
Contributor

alonh commented May 31, 2023

There is a minimum threshold on the number of positives (20 by default). If you remove elements and are left with fewer than 20, the progress bar goes backward.
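As a simplified illustration only (not the actual implementation): if the bar shows the smaller of two ratios, it can go backward even though the change counter itself never decreases. The `progress` function and the `changes_needed` default are hypothetical; only the default of 20 positives comes from the explanation above.

```python
def progress(num_positives, num_changes, min_positives=20, changes_needed=20):
    """Fraction of the way to the next model training, between 0.0 and 1.0."""
    positive_part = min(num_positives / min_positives, 1.0)
    change_part = min(num_changes / changes_needed, 1.0)
    # Both conditions must be met, so the bar follows the one lagging behind.
    return min(positive_part, change_part)


print(progress(num_positives=20, num_changes=20))  # 1.0
# Dropping below 20 positives pulls the bar back, even with more changes.
print(progress(num_positives=19, num_changes=21) < 1.0)  # True
```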
