Replicating SelectKBest + GridSearchCV results #19674

hex808080 · 2021-03-14T12:39:26Z

hex808080
Mar 14, 2021

I would like to be able to reproduce SelectKBest results when using GridSearchCV by performing the grid-search CV myself. However, I find my code to produce different results. Here is a reproducible example:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
import itertools

r = 1
X, y = make_classification(n_samples = 50, n_features = 20, weights = [3/5], random_state = r)
np.random.seed(r)
X = np.random.rand(X.shape[0], X.shape[1])

K = [1,3,5]
C = [0.1,1]
cv = StratifiedKFold(n_splits = 10)

# SKLEARN GRID-SEARCH
space = dict()
space['anova__k'] = K
space['svc__C'] = C    
clf = Pipeline([('anova', SelectKBest()), ('svc', SVC(probability = True, random_state = r))])
search = GridSearchCV(clf, space, scoring = 'roc_auc', cv = cv, refit = True, n_jobs = -1)
result = search.fit(X, y)

print('GridSearchCV results:')
print(result.cv_results_['mean_test_score'])

# MANUAL GRID-SEARCH
scores = []
for train_indx, test_indx in cv.split(X, y):
    X_train, y_train = X[train_indx,:], y[train_indx]
    X_test, y_test = X[test_indx,:], y[test_indx]
    scores_ = []
    for k, c in itertools.product(K, C):
        anova = SelectKBest(k = k)
        X_train_k = anova.fit_transform(X_train, y_train)
        clf = SVC(C = c, probability = True, random_state = r).fit(X_train_k, y_train)
        y_pred = clf.predict_proba(anova.transform(X_test))[:, 1]
        scores_.append(roc_auc_score(y_test, y_pred))
    scores.append(scores_)
    
print('Manual grid-search CV results:')    
print(np.mean(np.array(scores), axis = 0))

For me, this produces the following output:

GridSearchCV results:
[0.41666667 0.4        0.4        0.4        0.21666667 0.26666667]
Manual grid-search CV results:
[0.58333333 0.6        0.53333333 0.46666667 0.48333333 0.5       ]

On the other hand, when commenting the np.random.rand line and using the original data, the results are compatible.

Is there something trivial I'm missing?
Is there a bug in my code that makes the two approaches match for "good" data, but not random data?
Is there some random process that I am not aware of underneath?

Over the course of 8 days, I have posted this very same question on:

Stack Overflow (link)
r/scikit_learn subreddit (link)
r/learnpython subreddit (link)
Crossvalidated Stack Exchange (link)

and I have received no answer whatsoever.

Is the question unclear?
Is it poorly worded?
Is the code provided not reproducible?
Should I ask it somewhere else?
Is there anything I can do to help getting an answer?

Answered by NicolasHug

Mar 15, 2021

The difference between the 2 snippets is in how auc is computed. I haven't doubled checked but I would bet that decision_function is used in GridSearch, while you're using calibrated probabilities (probabilities=True) in the for loop.

Change this:

        # y_pred = clf.predict_proba(anova.transform(X_test))[:, 1]
        y_pred = clf.decision_function(anova.transform(X_test))

and you'll get the same results.

For a better code: also remove probabilities=True everywhere (and also remove passing random_state to the estimators): you don't need the probabilities to compute the AUC, the output of the decision_function is enough.

View full answer

NicolasHug · 2021-03-15T08:58:39Z

NicolasHug
Mar 15, 2021
Maintainer

The difference between the 2 snippets is in how auc is computed. I haven't doubled checked but I would bet that decision_function is used in GridSearch, while you're using calibrated probabilities (probabilities=True) in the for loop.

Change this:

        # y_pred = clf.predict_proba(anova.transform(X_test))[:, 1]
        y_pred = clf.decision_function(anova.transform(X_test))

and you'll get the same results.

For a better code: also remove probabilities=True everywhere (and also remove passing random_state to the estimators): you don't need the probabilities to compute the AUC, the output of the decision_function is enough.

4 replies

lesteve Mar 15, 2021
Maintainer

I was not aware of probability=True for SVC but I think the key thing from https://scikit-learn.org/stable/modules/svm.html#scores-and-probabilities is:

the probability estimates may be inconsistent with the scores:

which explains that predict_proba and decision_function does not give the same AUC.

I guess if you really want to use the calibrated probabilities in your AUC, the SO answer https://stackoverflow.com/a/66630864 gives you a way to use a custom scorer for this with needs_proba=True.

hex808080 Mar 15, 2021
Author

I confirm that by making the substitution above, the same results are achieved.

However this creates another question: how is this not an issue?

Given the same data, the same estimator, the same pipeline and the same machine, shouldn't ROC-AUC scores be the same regardless of the method used to calculate them? Shouldn't I be able to compare a classifier ROC-AUC performance scores optimized through GridSearchCV with the results from the same (or another) classifier obtained via the predict_proba>roc_auc_score functions?

Alternatively: since the methods to calculate ROC-AUC apparently do not match, if I had to decide between the two methods I used in my code, how can I assess which one is the correct one?

lesteve Mar 15, 2021
Maintainer

However this creates another question: how is this not an issue?

This is what I tried to explain above basically SVC(probability=True) is special/weird in the sense that its predict_proba and decisision_function method don't give the same ordering between samples so not the same AUC. I was suprised by this and if this was me I would use CalibratedClassifierCV(SVC()) rather than SVC(probability=True) (they should be similar but maybe not exactly the same if IIUC the note in https://scikit-learn.org/stable/modules/svm.html#scores-and-probabilities) to clearly differentiate between the calibrated SVC and the non-calibrated SVC rather than having a single estimator where predict_proba and decision_function are inconsistent.

Alternatively: since the methods to calculate ROC-AUC apparently do not match, if I had to decide between the two methods I used in my code, how can I assess which one is the correct one?

Not an expert on calibrated probabilities but may be this can help:
https://scikit-learn.org/stable/modules/calibration.html#calibration

NicolasHug Mar 16, 2021
Maintainer

how is this not an issue?

Yeah, it's not great. The calibration is supposed to be a monotonic transformation, but it's not always the case with the builtin SVM calibration, even when there's no tie. With CalibratedClassifierCV you should get consistent results (modulo ties in the decision_function). BTW, there's also #19588 which is a bit weird.

Alternatively: since the methods to calculate ROC-AUC apparently do not match, if I had to decide between the two methods I used in my code, how can I assess which one is the correct one?

As the probabilities are a bi-product of the decision values (whether you use probability=True or CalibratedClassifierCV), I would recommend using the decision function if you just need the AUC.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replicating SelectKBest + GridSearchCV results #19674

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 1 comment 4 replies

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

Select a reply

Replicating SelectKBest + GridSearchCV results #19674

hex808080 Mar 14, 2021

Replies: 1 comment · 4 replies

NicolasHug Mar 15, 2021 Maintainer

lesteve Mar 15, 2021 Maintainer

hex808080 Mar 15, 2021 Author

lesteve Mar 15, 2021 Maintainer

NicolasHug Mar 16, 2021 Maintainer

hex808080
Mar 14, 2021

Replies: 1 comment 4 replies

NicolasHug
Mar 15, 2021
Maintainer

lesteve Mar 15, 2021
Maintainer

hex808080 Mar 15, 2021
Author

lesteve Mar 15, 2021
Maintainer

NicolasHug Mar 16, 2021
Maintainer