Replies: 1 comment
By convention, in binary classification our models always return only one probability when called.
You might want to switch your convention to that of scikit-learn to avoid a lot of headaches ;)
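As a minimal sketch of the scikit-learn convention mentioned here (the data is made up for illustration): in a binary problem, `predict_proba` returns an array of shape `(n_samples, 2)`, the class order is given by `clf.classes_`, and metrics treat the second class (`clf.classes_[1]`) as the positive one by default.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: class 1 is associated with larger feature values.
X = np.array([[0.0], [0.2], [0.8], [1.0]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression().fit(X, y)
proba = clf.predict_proba(X)

print(proba.shape)   # (4, 2): one column per class
print(clf.classes_)  # [0 1]; column 1 holds P(class 1), the default positive class
```

Because every row of `proba` sums to 1, passing just `proba[:, 1]` (the positive-class column) to a metric loses no information.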
Hi everybody,
I want to evaluate my model using precision-recall scores because my data is unbalanced. Since I have a binary classification problem, I use a softmax at the end of my NN.
The output scores and true labels look something like this, where `y_score[:, 0]` corresponds to the probability of class 0. My positive label is 0, and thus the negative label is 1 in my case.
Since my dataset is unbalanced (more negatives than positives), I want to use the area under the precision-recall curve (AUPRC) to evaluate my classifier. The function `sklearn.metrics.precision_recall_curve` takes a parameter `pos_label`, which I would set to `pos_label = 0`. But the parameter `probas_pred` takes an ndarray of probabilities of shape `(n_samples,)`.
My question is: which column of my `y_score` should I take for `probas_pred`, given that I set `pos_label = 0`, and why? I hope my question is clear.
Thank you in advance!
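A minimal sketch of the setup described above (the `y_true` / `y_score` values are made-up stand-ins, not the original data): with `pos_label=0`, `probas_pred` must contain the probability of whichever class you declared positive, i.e. `y_score[:, 0]` here.

```python
import numpy as np
from sklearn.metrics import auc, precision_recall_curve

y_true = np.array([0, 0, 1, 1, 1, 1])  # class 0 is the rare, positive class
y_score = np.array([[0.9, 0.1],        # softmax outputs: column 0 = P(class 0)
                    [0.6, 0.4],
                    [0.4, 0.6],
                    [0.3, 0.7],
                    [0.2, 0.8],
                    [0.1, 0.9]])

# pos_label=0 tells sklearn which label counts as positive; the score array
# must then be the probabilities of that same class, i.e. column 0.
precision, recall, thresholds = precision_recall_curve(
    y_true, y_score[:, 0], pos_label=0)
auprc = auc(recall, precision)
print(auprc)  # 1.0 here, since the toy scores separate the classes perfectly
```

In short: the column you pass and the `pos_label` you set must refer to the same class; mixing them up (e.g. `y_score[:, 1]` with `pos_label=0`) silently produces a meaningless curve.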