
How does vecstack.StackingTransformer differ from sklearn.ensemble.StackingClassifier? #37

Closed
zachmayer opened this issue Feb 11, 2020 · 4 comments


@zachmayer

This might be useful to add to the readme

@vecxoz
Owner

vecxoz commented Feb 14, 2020

Hi,
Thanks for the good question!

Both classes are based on the same algorithm, described in the paper Stacked Generalization by David H. Wolpert, but they have very different conceptual implementations and applications. The most important difference is the transformer vs. predictor architecture.

vecstack.StackingTransformer is a dedicated transformer which handles both classification and regression tasks in a single class. It outputs out-of-fold (OOF) predictions and does not have a predict method. Direct access to OOF predictions is extremely important in stacking because it enables the full cycle of analytics and optimization: computing correlations, optimizing weights for averaging, etc. When everything is ready, we can easily combine an arbitrary number of stacking levels and a final estimator using sklearn.pipeline.Pipeline.
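To illustrate the kind of OOF analytics mentioned above, here is a minimal sketch using only sklearn's cross_val_predict as a stand-in for the OOF output that StackingTransformer exposes directly (the dataset and base models mirror the example below; the correlation check is just one possible diagnostic, not part of either library's API):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           n_redundant=1, n_classes=3, flip_y=0, random_state=0)

# Out-of-fold class probabilities for two base models
oof_et = cross_val_predict(ExtraTreesClassifier(n_estimators=100, random_state=0),
                           X, y, cv=5, method='predict_proba')
oof_rf = cross_val_predict(RandomForestClassifier(n_estimators=100, random_state=0),
                           X, y, cv=5, method='predict_proba')

# Example analytics on OOF: correlation between the two models' predicted
# probabilities for class 0 -- very high correlation suggests the models
# add little diversity to the ensemble.
corr = np.corrcoef(oof_et[:, 0], oof_rf[:, 0])[0, 1]
print(round(corr, 3))
```

Having these arrays in hand is what lets you compute correlations, blend with optimized weights, or drop redundant base models before fitting the final estimator.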

sklearn.ensemble.StackingClassifier was designed as a predictor which outputs predictions from the final estimator. It does not give access to out-of-fold predictions for the train set, even though it has a transform method. StackingClassifier.transform can correctly transform only X_test, because the estimators used in this method are trained on the full X_train, while stacking requires a cross-validation procedure to transform X_train.

Below I put together a self-contained example which shows the common ground between the two implementations (where the results are exactly identical). You can easily iterate on it to compare other aspects that matter for your use cases:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import StackingClassifier
from sklearn.pipeline import Pipeline
from vecstack import StackingTransformer

X, y = make_classification(n_samples=500, n_features=5,
                           n_informative=3, n_redundant=1,
                           n_classes=3, flip_y=0, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2,
                                                    random_state=0)

estimators = [
    ('et', ExtraTreesClassifier(n_estimators=100, random_state=0)),
    ('rf', RandomForestClassifier(n_estimators=100, random_state=0))]

final_estimator = LogisticRegression(random_state=0)

#-------------------------------------------------------------------------------
# vecstack.StackingTransformer
#-------------------------------------------------------------------------------

stack = StackingTransformer(estimators=estimators, 
                            regression=False, 
                            variant='B',
                            n_folds=5,
                            shuffle=False, 
                            stratified=True,
                            needs_proba=True)

steps = [('stack', stack),
         ('final_estimator', final_estimator)]

pipe = Pipeline(steps)

y_pred_vecstack = pipe.fit(X_train, y_train).predict_proba(X_test)

#-------------------------------------------------------------------------------
# sklearn.ensemble.StackingClassifier
#-------------------------------------------------------------------------------

clf = StackingClassifier(estimators=estimators,
                         final_estimator=final_estimator,
                         stack_method='predict_proba')

y_pred_sklearn = clf.fit(X_train, y_train).predict_proba(X_test)

print((y_pred_vecstack == y_pred_sklearn).all()) # True

#-------------------------------------------------------------------------------
# Compare transformation
#-------------------------------------------------------------------------------

S_test_vecstack = stack.transform(X_test)
S_test_sklearn = clf.transform(X_test)
print((S_test_vecstack == S_test_sklearn).all()) # True

S_train_vecstack = stack.transform(X_train)
S_train_sklearn = clf.transform(X_train)
print((S_train_vecstack == S_train_sklearn).all()) # False

# Show that sklearn's S_train equals the predictions of the base estimators
# refit on the full X_train (i.e. NOT out-of-fold predictions):
et = ExtraTreesClassifier(random_state=0, n_estimators=100)
rf = RandomForestClassifier(random_state=0, n_estimators=100)
y_pred_et = et.fit(X_train, y_train).predict_proba(X_train)
y_pred_rf = rf.fit(X_train, y_train).predict_proba(X_train)
print((S_train_sklearn == np.hstack([y_pred_et, y_pred_rf])).all()) # True

@zachmayer
Author

This is really informative! Thank you for the write-up, and thank you for the great package! (I think your comment would be a great addition to the readme, btw.)

Have you considered adding your package to sklearn-contrib?

@zachmayer
Author

"Direct access to OOF is extremely important in stacking" <- this seems like the key point. I totally agree!

@vecxoz
Owner

vecxoz commented Feb 15, 2020

Thanks for your kind words!
Yes, OOF is all we need.
Actually, I had not considered sklearn-contrib. I just want to be sklearn-compatible :)
