vlainic/matthews-correlation-coefficient

Matthews Correlation Coefficient (MCC)

MCC function for ML

Here I would like to share my implementations of the Matthews Correlation Coefficient (MCC) for several use cases.

Inspired by the Kaggle kernel by Michal, "Best loss function for F1-score metric".

Intro on MCC

I encountered MCC while searching for the "best multi-class classification metric".

Wikipedia has a very nice explanation of MCC, and at stats.stackexchange you can find a very interesting discussion on the topic. Multi-class MCC is often called the "R_K statistic", and there is a whole page devoted to it.

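The most useful expression for computation was Eq. (8) from the original article by Gorodkin:

$$R_K = \frac{N\,\mathrm{Tr}(C) - \sum_{kl} \tilde{C}_k \cdot \hat{C}_l}{\sqrt{N^2 - \sum_{kl} \tilde{C}_k \cdot (\hat{C^T})_l}\;\sqrt{N^2 - \sum_{kl} (\tilde{C^T})_k \cdot \hat{C}_l}}$$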

where N is the number of examples, \tilde{C}_k is the kth row of the confusion matrix C, \hat{C}_l the lth column of C, C^T is C transposed and Tr(C) is the trace of C.
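As a rough illustration of how Eq. (8) can be evaluated, here is a minimal pure-Python sketch. The function name `rk_from_confusion` is hypothetical and not part of this repository. It relies on the fact that the double sums over row-column dot products collapse to sums over the row and column totals of C. Returning 0 when the denominator vanishes is an assumed convention.

```python
import math

def rk_from_confusion(C):
    """Multi-class MCC (R_K) from a square confusion matrix given as a list of rows.

    Hypothetical helper for illustration only.
    """
    K = len(C)
    N = sum(sum(row) for row in C)            # total number of examples
    trace = sum(C[k][k] for k in range(K))    # Tr(C): correctly classified examples
    row_sums = [sum(C[k]) for k in range(K)]
    col_sums = [sum(C[k][l] for k in range(K)) for l in range(K)]

    # sum_{kl} (row k of C) . (column l of C) = sum_m col_sums[m] * row_sums[m]
    cross = sum(r * c for r, c in zip(row_sums, col_sums))

    # The two sums in the denominator collapse to sums of squared marginals.
    denom = (math.sqrt(N ** 2 - sum(c * c for c in col_sums))
             * math.sqrt(N ** 2 - sum(r * r for r in row_sums)))
    if denom == 0:
        return 0.0  # assumed convention when every example falls in one row or column
    return (N * trace - cross) / denom

# Example: a 2x2 confusion matrix (rows: true class, columns: predicted class).
print(rk_from_confusion([[3, 2], [1, 6]]))
```

For two classes this reduces to the familiar binary MCC formula, which is a convenient sanity check.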

Note: if you only need the MCC value itself, use sklearn.metrics.matthews_corrcoef!

binary_mcc_loss.py

A function that can be used as a loss function for Keras training in the binary classification case.
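One common way to turn MCC into a loss is to replace the hard confusion counts with "soft" counts built from predicted probabilities and to minimize 1 - MCC. The sketch below shows that arithmetic in plain Python. It is not the code from binary_mcc_loss.py: the name `soft_binary_mcc_loss` and the `eps` smoothing term are assumptions, and a Keras version would express the same sums with tensor operations so that gradients can flow.

```python
import math

def soft_binary_mcc_loss(y_true, y_prob, eps=1e-7):
    """1 - MCC computed from soft confusion counts (hypothetical sketch).

    y_true: iterable of 0/1 labels; y_prob: predicted probabilities of class 1.
    """
    pairs = list(zip(y_true, y_prob))
    tp = sum(t * p for t, p in pairs)              # soft true positives
    fp = sum((1 - t) * p for t, p in pairs)        # soft false positives
    fn = sum(t * (1 - p) for t, p in pairs)        # soft false negatives
    tn = sum((1 - t) * (1 - p) for t, p in pairs)  # soft true negatives

    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # eps (an assumption) avoids division by zero for degenerate batches.
    return 1.0 - numerator / (denominator + eps)
```

With this definition the loss is close to 0 for perfect predictions, about 1 for uninformative ones, and close to 2 for fully inverted ones.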

multi_mcc_loss.py

A function that can be used as a loss function for Keras training in the multi-class classification case.
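The multi-class case can be sketched along the same lines: build a soft confusion matrix from one-hot labels and predicted class probabilities, then plug it into the R_K expression. The plain-Python version below only illustrates the idea and is not the code from multi_mcc_loss.py. The name `soft_multiclass_mcc_loss` and the `eps` term are assumptions.

```python
import math

def soft_multiclass_mcc_loss(y_true, y_prob, eps=1e-7):
    """1 - R_K from a soft confusion matrix (hypothetical sketch).

    y_true: list of one-hot rows; y_prob: list of probability rows (each sums to 1).
    """
    K = len(y_true[0])
    # Soft confusion matrix: C[k][l] = sum_i y_true[i][k] * y_prob[i][l]
    C = [[sum(t[k] * p[l] for t, p in zip(y_true, y_prob)) for l in range(K)]
         for k in range(K)]

    N = sum(sum(row) for row in C)
    trace = sum(C[k][k] for k in range(K))
    row_sums = [sum(C[k]) for k in range(K)]
    col_sums = [sum(C[k][l] for k in range(K)) for l in range(K)]

    numerator = N * trace - sum(r * c for r, c in zip(row_sums, col_sums))
    denominator = (math.sqrt(N ** 2 - sum(c * c for c in col_sums))
                   * math.sqrt(N ** 2 - sum(r * r for r in row_sums)))
    # eps (an assumption) guards against a zero denominator.
    return 1.0 - numerator / (denominator + eps)
```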

plot_mcc_vs_tresh.py

Following the practice of plotting precision/recall vs. thresholds, I wanted to see how MCC behaves as the decision threshold changes.
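The sweep behind such a plot might look like the sketch below: binarize the predicted probabilities at each threshold and compute MCC for the resulting labels. The helper names `binary_mcc` and `mcc_vs_threshold` and the toy data are hypothetical, and the plotting step is left out. The resulting (threshold, MCC) pairs could be passed to any plotting library.

```python
import math

def binary_mcc(y_true, y_pred):
    """Binary MCC from hard 0/1 labels; returns 0.0 when the denominator is zero."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator > 0 else 0.0

def mcc_vs_threshold(y_true, y_prob, thresholds):
    """Return (threshold, MCC) pairs, predicting class 1 when prob >= threshold."""
    return [(th, binary_mcc(y_true, [1 if p >= th else 0 for p in y_prob]))
            for th in thresholds]

# Toy data, made up for illustration.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
curve = mcc_vs_threshold(y_true, y_prob, [0.2, 0.5, 0.9])
for th, mcc in curve:
    print(f"threshold={th:.2f}  MCC={mcc:.3f}")
```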