New error message from established workflow: AttributeError: 'Pipeline' object has no attribute '_check_fit_params' [BUG] #1078

Open
dysartk opened this issue Apr 17, 2024 · 7 comments

dysartk commented Apr 17, 2024

Describe the bug

I received this error when running an established workflow that builds a pipeline, applies SMOTE, and then uses GridSearchCV to tune the hyperparameters. I reproduced the error on two machines, one running the latest version of PopOS and one running Mac OS X. Python version 3.11.5. The sample code below, generated with Copilot, reproduces the exact error with the same pipeline structure. The same code ran without errors within the 24 hours before this post.

Steps/Code to Reproduce

from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import classification_report

# Create a binary classification dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_redundant=10, n_clusters_per_class=1, weights=[0.99], flip_y=0, random_state=1)

# Split the dataset into training set and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define pipeline
model = svm.SVC()
resampling = SMOTE(sampling_strategy=0.5)  # SMOTE is applied inside each cross-validation fold, not before the split
pipeline = Pipeline([('SMOTE', resampling), ('SVM', model)])

# Define parameter grid
param_grid = [
    {'SVM__C': [1, 10, 100, 1000], 'SVM__kernel': ['linear']},
    {'SVM__C': [1, 10, 100, 1000], 'SVM__gamma': [0.001, 0.0001], 'SVM__kernel': ['rbf']},
]

# Apply Grid Search
grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# Print the best parameters
print("Best parameters: ", grid_search.best_params_)

# Predict on the test set
y_pred = grid_search.predict(X_test)

# Print the classification report
print(classification_report(y_test, y_pred))

Error Message

ValueError                                Traceback (most recent call last)
Cell In[1], line 27
     25 # Apply Grid Search
     26 grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='accuracy')
---> 27 grid_search.fit(X_train, y_train)
     29 # Print the best parameters
     30 print("Best parameters: ", grid_search.best_params_)

File ~/anaconda3/lib/python3.11/site-packages/sklearn/base.py:1474, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
   1467     estimator._validate_params()
   1469 with config_context(
   1470     skip_parameter_validation=(
   1471         prefer_skip_nested_validation or global_skip_validation
   1472     )
   1473 ):
-> 1474     return fit_method(estimator, *args, **kwargs)

File ~/anaconda3/lib/python3.11/site-packages/sklearn/model_selection/_search.py:970, in BaseSearchCV.fit(self, X, y, **params)
    964     results = self._format_results(
    965         all_candidate_params, n_splits, all_out, all_more_results
    966     )
    968     return results
--> 970 self._run_search(evaluate_candidates)
    972 # multimetric is determined here because in the case of a callable
...
  File "/home/kevindysart/anaconda3/lib/python3.11/site-packages/imblearn/pipeline.py", line 292, in fit
    fit_params_steps = self._check_fit_params(**fit_params)
                       ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Pipeline' object has no attribute '_check_fit_params'
@glemaitre
Member

What is your version of scikit-learn?
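
One way to answer this is to print the installed versions from the same environment that raises the error. The following is a minimal sketch, not an official diagnostic: the helper name `installed_versions` is hypothetical, and the default package list is only an assumption about which distributions matter here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("scikit-learn", "imbalanced-learn", "joblib")):
    """Return a {distribution name: version string or None} mapping.

    The default package list is an assumption; pass your own tuple to
    check other distributions.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name} = {ver}")
```

Run it with the same interpreter (or notebook kernel) that runs the failing workflow, since a different environment may report different versions.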

@glemaitre
Member

I assume that a quick fix would be to upgrade scikit-learn. I'll check on my side if we have a compatibility issue with older versions.

@mahanrahimi

Are there any updates on this issue? I have the same problem of getting AttributeError: 'Pipeline' object has no attribute '_check_fit_params' on Windows with:
python = 3.10.14
scikit-learn = 1.4.2
imbalanced-learn = 0.11.0
joblib = 1.4.0

@dysartk
Author

dysartk commented May 13, 2024 via email

@mahanrahimi

Kevin, could you please let me know which versions of python, scikit-learn, and imbalanced-learn you are using? I have tried different versions. Also, I have uninstalled and re-installed Anaconda. But the same error still shows up.

@glemaitre
Member

> imbalanced-learn = 0.11.0

@mahanrahimi you need to update to 0.12.2 for compatibility with scikit-learn 1.4.2

@mahanrahimi

> > imbalanced-learn = 0.11.0
>
> @mahanrahimi you need to update to 0.12.2 for compatibility with scikit-learn 1.4.2

Thank you so much, this resolved my problem.
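
The version pairing that resolved this thread can be written as a small check. This is only a sketch under stated assumptions: the helper names `parse_version` and `needs_imblearn_upgrade` are hypothetical, the thresholds are the two versions named in the thread (scikit-learn 1.4.2 and imbalanced-learn 0.12.2) rather than an official compatibility table, and the parser handles plain numeric versions only.

```python
def parse_version(text):
    """Turn '1.4.2' into (1, 4, 2); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def needs_imblearn_upgrade(sklearn_version, imblearn_version,
                           sklearn_floor="1.4.2", imblearn_floor="0.12.2"):
    """Return True when the thread's advice applies.

    Assumption taken from the thread only: with scikit-learn at or above
    1.4.2, imbalanced-learn should be at least 0.12.2.
    """
    sklearn_is_new = parse_version(sklearn_version) >= parse_version(sklearn_floor)
    imblearn_is_old = parse_version(imblearn_version) < parse_version(imblearn_floor)
    return sklearn_is_new and imblearn_is_old


# The combination reported in the thread
print(needs_imblearn_upgrade("1.4.2", "0.11.0"))
```

A `False` result only means the thread's advice does not apply to that pair; it does not establish that the two versions are compatible.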
