Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HSTree 'works' without error with unsupported sklearn models, but doesn't do anything #199

Open
bking124 opened this issue Jan 18, 2024 · 0 comments

Comments

@bking124
Copy link

First off, very much appreciate the package--there are a ton of interesting methods contained here! I was particularly interested in the Hierarchical Shrinkage trees, and had hoped they would be applicable to some of the other gradient boosting methods like LightGBM.

I realize now these are not supported. However, if a user tries to wrap a LGBM classifier with HSTree, they can do so. The fit and predict methods will work without error, but actually nothing changes to the underlying estimator. Similar points were made in #137 and #171. A reproducing example:

from sklearn import datasets
from sklearn.model_selection import train_test_split
from lightgbm import LGBMClassifier
from imodels import HSTreeClassifier
import numpy as np

np.random.seed(15)
X, y = datasets.load_breast_cancer(return_X_y=True)  # binary classification

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=10)
 
#Train LGBM 
myLGBM = LGBMClassifier()
myLGBM.fit(X_train, y_train)
 
#LGBM 'wrapped' by HSTree
myHST = HSTreeClassifier(LGBMClassifier(), reg_param=1000)
myHST.fit(X_train, y_train)
 
#Show they predict same probas
np.array_equal(myLGBM.predict_proba(X_test), myHST.predict_proba(X_test)) #true

I think this behavior is a bit misleading, and instead the user should be prompted with a warning/error saying the model is unsupported. From my understanding of the code, a model is unsupported if it contains neither a 'tree_' nor an 'estimators_' attribute, so a check for these attributes could perhaps be included as part of the initialization code. Please let me know if I've misunderstood anything!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant