-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FIX]: First attempt to get a classifiers plot function also build confusion matrices from a subset of labels #595
base: master
Are you sure you want to change the base?
Conversation
…tring. Disable error message for dimension mismatch between matrix and label lengths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the ultimate goal is somewhat reverse in this PR -- you are trying to plot subportion of the matrix (by providing subset of labels known to ConfusionMatrix), instead of needed for use "order" being a super set defining the order of the labels potentially with some which dataset at hands didn't have
@@ -969,6 +969,10 @@ def plot(self, labels=None, numbers=False, origin='upper', | |||
v = None | |||
if l is not None: v = labels_map_full[l] | |||
labels_plot += [v] | |||
elif (len(labels) != labels_order_filtered_set) and \ | |||
all(isinstance(x, str) for x in labels_order_filtered_set): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type shouldn't matter really - I might be having numeric labels
also, why not just to use sets operation to check exactly that labels_order_filtered_set
includes all we would like to plot like
all(isinstance(x, str) for x in labels_order_filtered_set): | |
elif not (set(labels) - labels_order_filtered_set): |
? because len
is too weak of a check -- I could have completely different labels from what is provided to labels_order and condition would still pass, but then you would end up with no labels_plot
"Number of labels %d doesn't correspond the size" + \ | ||
" of a confusion matrix %s" % (NlabelsNN, matrix.shape) | ||
# I believe this Error is superfluous -- or rather: wouldn't let | ||
# subsets of matrices be printed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want subsets of matrices printed? that is related to the comment above. It could be a feature indeed, but I think if desired, it should be explicit, like adding an additional parameter allow_partial=False
, so by default we would blow (so not to miss errors in specification) and only if explicitly instructed -- then to be allowed.
@@ -673,6 +673,26 @@ def test_confusionmatrix_nulldist(self): | |||
cvnp[(np.array([0,1]), np.array([1,0]))]) | |||
|
|||
|
|||
def test_plotting_subset(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if we do allow (optionally) the subset to be visualized -- then yeah, worth having this test
our use case though is different -- it is when we provide labels (to order) which has more labels than it is known to the ca.stats.labels (like we include "brain" in providing the order, but then we had no "brain" label within our dataset we worked on)
res = cv(ds) | ||
# get all labels | ||
labels = cv.ca.stats.labels | ||
cv.ca.stats.plot(labels=labels, numbers=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for unit testing ConfusionMatrix
there is no need to do full blown classification. Just create an instance of ConfusionMatrix (see other tests) and work with that.
This PR:
labels
parameter