Feature Imps not storing with Forecaster.tune_test_forecast #85

Open
John-Miller12 opened this issue Oct 19, 2023 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@John-Miller12

Hello,

Thank you for your development and support of this valuable package.

I cannot get feature_importance=True and summary_stats=True to behave as expected. All models seem to be affected.

I ran --upgrade yesterday.

Environment:

  • Mac with M2 Pro and Sonoma 14.0
  • JupyterLab running with AMD64 emulation on Docker, using the up-to-date :latest tag and no other package installs (that I can think of, at least)

My forecaster objects are generated by:

import pandas as pd
import numpy as np
from scalecast.Forecaster import Forecaster
from scalecast import GridGenerator
from scalecast.multiseries import export_model_summaries
from datetime import datetime
import matplotlib.pyplot as plt
import seaborn as sns

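# dfsd (defined elsewhere) maps each series name to a DataFrame
# with a date index and a GROSS_COUNT column.
forecasters = {}
for k, v in dfsd.items():
    f = Forecaster(
        y=v['GROSS_COUNT'],
        current_dates=v.index,
    )
    f.generate_future_dates(360)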
    f.set_test_length(.1)
    f.set_validation_length(28)
    f.add_seasonal_regressors(
        'dayofyear',
        'week',
        'month',
        'quarter',
        raw=False,
        sincos=True,
    )
    f.add_seasonal_regressors('year')
    f.add_time_trend()
    f.add_covid19_regressor(end=datetime(2022,2,1,0,0))
    forecasters[k] = f

tune_test_forecast is looped with:

for f in forecasters.values():
    f.tune_test_forecast(models, feature_importance=True, summary_stats=True)

The warnings/errors I get are below. Prophet verbose INFO has been removed:

/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2729: Warning: rf does not have summary stats.
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2705: Warning: Cannot set pfi feature importance on rf. Here is the error: cannot import name 'if_delegate_has_method' from 'sklearn.utils.metaestimators' (/opt/conda/lib/python3.11/site-packages/sklearn/utils/metaestimators.py)
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2729: Warning: gbt does not have summary stats.
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2705: Warning: Cannot set pfi feature importance on gbt. Here is the error: cannot import name 'if_delegate_has_method' from 'sklearn.utils.metaestimators' (/opt/conda/lib/python3.11/site-packages/sklearn/utils/metaestimators.py)
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2729: Warning: xgboost does not have summary stats.
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2705: Warning: Cannot set pfi feature importance on xgboost. Here is the error: cannot import name 'if_delegate_has_method' from 'sklearn.utils.metaestimators' (/opt/conda/lib/python3.11/site-packages/sklearn/utils/metaestimators.py)
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2729: Warning: prophet does not have summary stats.
  warnings.warn(
/opt/conda/lib/python3.11/site-packages/scalecast/Forecaster.py:2705: Warning: Cannot set pfi feature importance on prophet. Here is the error: 'Forecaster' object has no attribute 'X'
  warnings.warn(

I would expect FI for most of these sklearn models. Can you please help me understand this miss?
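
The import failure quoted in the warnings above can be reproduced outside scalecast, which helps separate a dependency problem from a problem in the forecasting code. The following is a minimal sketch using only the standard library; the helper name `can_import_name` is hypothetical and not part of scalecast or scikit-learn.

```python
import importlib


def can_import_name(module_name: str, attr_name: str) -> bool:
    """Return True if `module_name` imports and exposes `attr_name`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return False
    return hasattr(module, attr_name)


# The name and module below are taken from the warning text in this issue.
importable = can_import_name(
    "sklearn.utils.metaestimators", "if_delegate_has_method"
)
print("if_delegate_has_method importable:", importable)
```

If this prints `False` while scikit-learn itself is installed, the installed scikit-learn no longer provides the name that a dependency is trying to import, and the pfi warnings are not specific to any one model.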

@mikekeith52 mikekeith52 added the bug Something isn't working label Oct 20, 2023
@mikekeith52 mikekeith52 self-assigned this Oct 20, 2023
@mikekeith52
Owner

That is not working as expected. Give me a little bit to look into it. Thanks for raising the issue!

@mikekeith52
Owner

mikekeith52 commented Oct 23, 2023

After investigating, I am sure that the root of the problem is with the eli5 library. See this issue. I can't say for sure whether the developers of that package will ever update it to work with newer versions of scikit-learn, so scalecast may need a workaround. I'm not sure what that would be, as scikit-learn 1.3.1 is needed to do some things in scalecast. If you need feature importance while I try to figure something out, you can try setting method='shap' when using feature importance, but I believe it only works for tree-based models right now.
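
For scalecast versions before 0.19.4, the suggestion above might look like the sketch below when applied to the loop from the original post. It assumes that `tune_test_forecast()` accepts an `fi_method` argument (the commit notes later in this thread say that argument was removed in the fix, so it should exist in earlier versions) and that `'shap'` is an accepted value for it. The wrapper name `tune_with_shap_importance` is hypothetical.

```python
def tune_with_shap_importance(forecasters, models):
    """Run tune_test_forecast on every Forecaster, asking for shap importance.

    `forecasters` is a dict of Forecaster objects and `models` is a list of
    estimator names, as in the original post.
    """
    for f in forecasters.values():
        # Assumption: fi_method="shap" replaces the default pfi method,
        # which depends on eli5. Applies only to versions before 0.19.4.
        f.tune_test_forecast(
            models,
            feature_importance=True,
            fi_method="shap",
        )
```

With the objects from the original post this would be called as `tune_with_shap_importance(forecasters, models)`. Per the comment above, non-tree models in the list would likely still raise the feature importance warning.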

mikekeith52 pushed a commit that referenced this issue Oct 27, 2023
- Added more feature importance options, all sourced through the shap library.
- shap is now a requirement and eli5 is not.
- Changed `Forecaster.reduce_Xvars()` to use only shap feature importance to rank features.
- Removed `fi_method` argument from `tune_test_forecast()`.
- Fixed how a pandas function was called that was raising a warning.
- Fixed feature importance to use shap only with TreeExplainer, PermutationExplainer, and other explainers (#85). See the [docs](https://scalecast.readthedocs.io/en/latest/Forecaster/Forecaster.html#src.scalecast.Forecaster.Forecaster.save_feature_importance). The eli5 package appears to be deprecated.
@mikekeith52
Owner

The fix for this is in 0.19.4. See the new save_feature_importance() documentation. Feature importance has been expanded in the package to include the two types previously offered, plus five additional methods, all through the shap package.

If you agree with the fix, we will close the issue.
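
To confirm that an environment already has the 0.19.4 fix mentioned above, the installed version can be compared against that number. This is a rough sketch using only the standard library; the helper names are hypothetical, the distribution name `scalecast` is an assumption about how the package is registered, and the comparison looks only at the leading digits of each dotted component, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '0.19.4' into (0, 19, 4), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed: str, required: str) -> bool:
    """Return True if `installed` is the same as or newer than `required`."""
    return version_tuple(installed) >= version_tuple(required)


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("scalecast")
if found is None:
    print("scalecast is not installed in this environment")
else:
    print(f"scalecast {found}; includes the fix: {is_at_least(found, '0.19.4')}")
```

Running this inside the same Docker image as the notebook shows whether the upgrade actually reached the kernel's environment.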

2 participants