Multi-categorical confusion matrix calculation for labels not presented in predicted_labels #138

Open
xiaoyi-cheng opened this issue Mar 4, 2023 · 3 comments

@xiaoyi-cheng
Contributor

Feedback from Bilal from a PR review: #136 (comment)

How are we supposed to handle cases when a predicted label (in this case 2) is not present in the observed labels (in this case [1])? Some options are:

1. We limit the confusion matrix (CM) to labels that are present in both the observed labels (label_series) and the predicted labels (predicted_label_series).
2. CM contains labels from the union of observed and predicted labels. This is what sklearn's confusion_matrix does by default when no labels argument is passed.
3. CM contains labels from observed labels only. If a predicted label is not found in observed labels, we raise an error saying something like "Unknown label 2".
I think we should pick option 3 since it assumes that observed labels provide us a complete list of all the possible labels. Option 1 could be problematic because it will drop some valid observed labels in case they are not found in predicted labels.
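
To make the trade-off concrete, here is a minimal sketch of the label set each option would produce. The helper name candidate_label_sets and the sample data are hypothetical and are not part of the library; the function accepts any iterable of labels, so a pandas Series such as label_series should work as well.

```python
def candidate_label_sets(observed_labels, predicted_labels):
    """Return the labels the confusion matrix would cover under each option.

    Hypothetical helper for illustration only; not the library's API.
    """
    observed = set(observed_labels)
    predicted = set(predicted_labels)
    return {
        # Option 1: only labels seen in both series.
        "option_1_intersection": sorted(observed & predicted),
        # Option 2: every label seen in either series.
        "option_2_union": sorted(observed | predicted),
        # Option 3: observed labels are treated as the complete label list.
        "option_3_observed_only": sorted(observed),
        # Predicted labels that option 3 would reject with an error.
        "unknown_predicted": sorted(predicted - observed),
    }


# Made-up data: label 0 is observed but never predicted,
# and label 2 is predicted but never observed.
label_sets = candidate_label_sets([0, 1, 1], [1, 2, 1])
```

With this made-up data, option 1 keeps only label 1 and silently drops the valid observed label 0, which is the concern raised above about option 1.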

If we opt for 3, we should raise an error in this line.
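
As a rough sketch of what option 3 could look like, assuming the error is a ValueError raised before any counting happens: the function name confusion_matrix_observed_only and its return shape are hypothetical and do not reflect the library's actual code.

```python
from collections import Counter


def confusion_matrix_observed_only(observed_labels, predicted_labels):
    """Build a confusion matrix over observed labels only (option 3).

    Hypothetical sketch, not the library's implementation. Rows are
    observed labels, columns are predicted labels, both in sorted order.
    """
    observed_labels = list(observed_labels)
    predicted_labels = list(predicted_labels)
    labels = sorted(set(observed_labels))

    # Reject any predicted label that never occurs in the observed labels.
    unknown = sorted(set(predicted_labels) - set(labels))
    if unknown:
        raise ValueError(f"Unknown label {unknown[0]}")

    # Count (observed, predicted) pairs; missing pairs count as zero.
    pair_counts = Counter(zip(observed_labels, predicted_labels))
    matrix = [[pair_counts[(obs, pred)] for pred in labels] for obs in labels]
    return labels, matrix


# Label 0 is observed but never predicted; it still gets a row and a column.
labels, matrix = confusion_matrix_observed_only([0, 1, 1], [1, 1, 1])

# The case from this issue: observed labels [1], predicted label 2.
try:
    confusion_matrix_observed_only([1], [2])
except ValueError as err:
    error_message = str(err)
```

Whether the check belongs in the library or in the caller is the open question in this thread; the sketch only shows the check itself.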

We need to figure out whether to handle this on the analyzer side or in the library.

@goswamig
Member

goswamig commented Mar 6, 2023

CC @bilalaws

@goswamig
Member

goswamig commented Mar 6, 2023

We need to dive deep into the container and see whether there are better options than those mentioned above.

@goswamig
Member

goswamig commented Mar 6, 2023

@bilalaws do we have data on how often customers will hit this case, that is, a predicted label (in this case 2) that is not present in the observed labels (in this case [1])?
