[FIX]: First attempt to get a classifiers plot function also build confusion matrices from a subset of labels #595

adswa · 2018-12-14T15:48:46Z

This PR:

allows plotting of confusion matrices with a subset of labels of the original classification, if they are specified as strings within the labels parameter
disables plot() from blowing up if the length of provided labels does not match the dimensions of the matrix
add a small smoketest to see whether plotting of full or subsets of the labels blows up

…tring. Disable error message for dimension mismatch between matrix and label lengths

…ix blows

coveralls · 2018-12-14T16:05:00Z

Coverage decreased (-0.2%) to 80.035% when pulling 4452807 on AdinaWagner:confusion into fe0b765 on PyMVPA:master.

adswa · 2018-12-14T18:20:39Z

(@yarikoptic)

yarikoptic

I think the ultimate goal is somewhat reverse in this PR -- you are trying to plot subportion of the matrix (by providing subset of labels known to ConfusionMatrix), instead of needed for use "order" being a super set defining the order of the labels potentially with some which dataset at hands didn't have

yarikoptic · 2018-12-14T19:34:43Z

mvpa2/clfs/transerror.py

@@ -969,6 +969,10 @@ def plot(self, labels=None, numbers=False, origin='upper',
                    v = None
                    if l is not None: v = labels_map_full[l]
                    labels_plot += [v]
+            elif (len(labels) != labels_order_filtered_set) and \
+                    all(isinstance(x, str) for x in labels_order_filtered_set):


type shouldn't matter really - I might be having numeric labels
also, why not just to use sets operation to check exactly that labels_order_filtered_set includes all we would like to plot like

Suggested change

all(isinstance(x, str) for x in labels_order_filtered_set):

elif not (set(labels) - labels_order_filtered_set):

? because len is too weak of a check -- I could have completely different labels from what is provided to labels_order and condition would still pass, but then you would end up with no labels_plot

yarikoptic · 2018-12-14T19:36:09Z

mvpa2/clfs/transerror.py

-                  "Number of labels %d doesn't correspond the size" + \
-                  " of a confusion matrix %s" % (NlabelsNN, matrix.shape)
+        # I believe this Error is superfluous -- or rather: wouldn't let
+        # subsets of matrices be printed.


do we want subsets of matrices printed? that is related to the comment above. It could be a feature indeed, but I think if desired, it should be explicit, like adding an additional parameter allow_partial=False, so by default we would blow (so not to miss errors in specification) and only if explicitly instructed -- then to be allowed.

yarikoptic · 2018-12-14T19:38:08Z

mvpa2/tests/test_transerror.py

@@ -673,6 +673,26 @@ def test_confusionmatrix_nulldist(self):
                                      cvnp[(np.array([0,1]), np.array([1,0]))])


+def test_plotting_subset():


so if we do allow (optionally) the subset to be visualized -- then yeah, worth having this test

our use case though is different -- it is when we provide labels (to order) which has more labels than it is known to the ca.stats.labels (like we include "brain" in providing the order, but then we had no "brain" label within our dataset we worked on)

yarikoptic · 2018-12-14T19:38:51Z

mvpa2/tests/test_transerror.py

+    res = cv(ds)
+    # get all labels
+    labels = cv.ca.stats.labels
+    cv.ca.stats.plot(labels=labels, numbers=True)


for unit testing ConfusionMatrix there is no need to do full blown classification. Just create an instance of ConfusionMatrix (see other tests) and work with that.

adswa added 2 commits December 14, 2018 10:42

enable subset of matrix to be plotted if label names are given as a s…

de53286

…tring. Disable error message for dimension mismatch between matrix and label lengths

add a smoketest to see whether plotting of full- or subsets of a matr…

4452807

…ix blows

yarikoptic requested changes Dec 14, 2018

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FIX]: First attempt to get a classifiers plot function also build confusion matrices from a subset of labels #595

[FIX]: First attempt to get a classifiers plot function also build confusion matrices from a subset of labels #595

adswa commented Dec 14, 2018

coveralls commented Dec 14, 2018

adswa commented Dec 14, 2018

yarikoptic left a comment

yarikoptic Dec 14, 2018

yarikoptic Dec 14, 2018

yarikoptic Dec 14, 2018

yarikoptic Dec 14, 2018

	all(isinstance(x, str) for x in labels_order_filtered_set):
	elif not (set(labels) - labels_order_filtered_set):

		@@ -673,6 +673,26 @@ def test_confusionmatrix_nulldist(self):
		cvnp[(np.array([0,1]), np.array([1,0]))])


		def test_plotting_subset():

[FIX]: First attempt to get a classifiers plot function also build confusion matrices from a subset of labels #595

Are you sure you want to change the base?

[FIX]: First attempt to get a classifiers plot function also build confusion matrices from a subset of labels #595

Conversation

adswa commented Dec 14, 2018

coveralls commented Dec 14, 2018

adswa commented Dec 14, 2018

yarikoptic left a comment

Choose a reason for hiding this comment

yarikoptic Dec 14, 2018

Choose a reason for hiding this comment

yarikoptic Dec 14, 2018

Choose a reason for hiding this comment

yarikoptic Dec 14, 2018

Choose a reason for hiding this comment

yarikoptic Dec 14, 2018

Choose a reason for hiding this comment