Detecting precision of values in lookup list #254

jhoetter · 2023-05-16T11:21:39Z

Is your feature request related to a problem? Please describe.
I love lookup lists, but I typically have just one lookup list per label. When I collect all values in a single list, a few of them can hurt performance. For example, I recently labeled the word "riot" as negative and then built a labeling function that looks for these words. Because of "riot", I also hit words such as "patriot" (just an example).

In general, lookup lists don't always have 100% precision. Adding words can make them worse, especially if the words are very short and can occur inside other words that have a different meaning.
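To make the "riot" / "patriot" case concrete, here is a minimal sketch comparing plain substring matching with whole-word matching. The function names are hypothetical and this is not how the tool's labeling functions are actually implemented; it only illustrates why short values tend to over-match.

```python
import re

def substring_hits(text, values):
    """Return lookup values that occur anywhere in the text (naive matching)."""
    lowered = text.lower()
    return [v for v in values if v.lower() in lowered]

def whole_word_hits(text, values):
    """Return lookup values that occur as whole words only."""
    return [
        v for v in values
        if re.search(rf"\b{re.escape(v)}\b", text, flags=re.IGNORECASE)
    ]

lookup = ["riot"]
print(substring_hits("A true patriot spoke.", lookup))   # ['riot']
print(whole_word_hits("A true patriot spoke.", lookup))  # []
```

Whole-word matching would avoid this particular false hit, but it doesn't replace per-value precision stats: a value can still be a full word and carry a different meaning in context.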

Describe the solution you'd like
When I add new values to a lookup list, I'd love to see how precise the association between each value and the given label actually is at the item level. For instance, I want to see that "riot" has a precision of 0.5 in my lookup list.

In general, I just want some help that tells me when an item in a lookup list shouldn't be in there.

Describe alternatives you've considered
In theory, I could add a labeling function for each and every item in the lookup list and calculate the stats that way. I clearly don't want to do that, especially because of the I/O.

What I could do, however, is run an analysis on a lookup list on demand (e.g. when I actively request the calculation) that calculates the precision stats for every item in the list individually, given that the label has the same name as the lookup list (alternatively, I could enter the label I want to analyze). The computationally expensive part is not running the stats individually but gathering the data and putting it into the containerized envs. So this should be possible, afaik.
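A rough sketch of what such an on-demand analysis could compute, assuming the manually labeled records are already available as `(text, label)` pairs. The function name, the data shape, and the substring matching are all assumptions for illustration, not the tool's actual API:

```python
from collections import defaultdict

def per_value_precision(records, values, target_label):
    """For each lookup value, compute the share of matching records
    whose manual label equals target_label (None if no record matches)."""
    hits = defaultdict(int)
    correct = defaultdict(int)
    for text, label in records:
        lowered = text.lower()
        for value in values:
            # Naive substring match, mirroring the behavior described above.
            if value.lower() in lowered:
                hits[value] += 1
                if label == target_label:
                    correct[value] += 1
    return {v: (correct[v] / hits[v] if hits[v] else None) for v in values}

# Hypothetical labeled records, made up for this example.
records = [
    ("The riot spread downtown", "negative"),
    ("She is a proud patriot", "positive"),
    ("Terrible service", "negative"),
    ("Terrible weather ruined the trip", "negative"),
]
stats = per_value_precision(records, ["riot", "terrible", "awful"], "negative")
print(stats)  # {'riot': 0.5, 'terrible': 1.0, 'awful': None}
```

All values are evaluated in a single pass over the data, so the records only need to be gathered once rather than once per value, which is the point made above about where the cost sits.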

Additional context
-

jhoetter added the enhancement New feature or request label May 16, 2023
