model_names don't show when ensemble_config is set #427
Open
ghk829 opened this issue May 30, 2019 · 4 comments
ghk829 commented May 30, 2019

I ran this code:

import os
os.environ['is_test_suite'] = "True"  # workaround for the multiprocessing/pickling bug I reported in #426
from auto_ml import Predictor
from auto_ml.utils import get_boston_dataset
from auto_ml.utils_models import load_ml_model

# Load data
df_train, df_test = get_boston_dataset()

# Tell auto_ml which column is 'output'
# Also note columns that aren't purely numerical
# Examples include ['nlp', 'date', 'categorical', 'ignore']
#
column_descriptions = {
    'CHAS': 'output'
}

ml_predictor = Predictor(type_of_estimator='classification', column_descriptions=column_descriptions)
ml_predictor.train(df_train,
                   ensemble_config=[{"model_name": "GradientBoostingClassifier"},
                                    {"model_name": "RandomForestClassifier"}],
                   perform_feature_selection=True)

The result only shows:

ml_predictor.model_names
['GradientBoostingClassifier']
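The mismatch can be stated as a simple invariant: every model_name passed in ensemble_config should appear in model_names after training. A minimal pure-Python illustration of what goes missing (plain lists stand in for auto_ml's Predictor internals, so this is a hypothetical sketch, not auto_ml code):

```python
# Hypothetical illustration of the expected invariant; the real objects
# come from auto_ml's Predictor, not from these plain lists.
ensemble_config = [
    {"model_name": "GradientBoostingClassifier"},
    {"model_name": "RandomForestClassifier"},
]
expected = [entry["model_name"] for entry in ensemble_config]

# What ml_predictor.model_names actually reported after training:
actual = ["GradientBoostingClassifier"]

# The models that were silently dropped:
missing = [name for name in expected if name not in actual]
print(missing)
```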
@loaiabdalslam

pip install -r advanced_requirements.txt

Check whether tensorflow and keras are installed on your machine or in your anaconda env, because I ran your code and it runs successfully:
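A quick standard-library way to check whether those packages are importable in the current environment (no auto_ml assumptions involved):

```python
import importlib.util

# Report which of the optional backends can be imported from this env.
for pkg in ("tensorflow", "keras"):
    spec = importlib.util.find_spec(pkg)
    print(f"{pkg}: {'installed' if spec else 'NOT installed'}")
```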

Welcome to auto_ml! We're about to go through and make sense of your data using machine learning, and give you a production-ready pipeline to get predictions with.

If you have any issues, or new feature ideas, let us know at http://auto.ml
You are running on version 2.9.10
Now using the model training_params that you passed in:
{}
After overwriting our defaults with your values, here are the final params that will be used to initialize the model:
{'presort': False, 'learning_rate': 0.1, 'warm_start': True}
Running basic data cleaning
Fitting DataFrameVectorizer
Now using the model training_params that you passed in:
{}
After overwriting our defaults with your values, here are the final params that will be used to initialize the model:
{'presort': False, 'learning_rate': 0.1, 'warm_start': True}


********************************************************************************************
About to run GridSearchCV on the pipeline for several models to predict CHAS
Fitting 2 folds for each of 2 candidates, totalling 4 fits
[CV] _scorer=<auto_ml.utils_scoring.RegressionScorer object at 0x7f8174e77780>, model=GradientBoostingRegressor(alpha=0.9, criterion='friedman_mse', init=None,
                          learning_rate=0.1, loss='ls', max_depth=3,
                          max_features=None, max_leaf_nodes=None,
                          min_impurity_decrease=0.0, min_impurity_split=None,
                          min_samples_leaf=1, min_samples_split=2,
                          min_weight_fraction_leaf=0.0, n_estimators=100,
                          n_iter_no_change=None, presort=False,
                          random_state=None, subsample=1.0, tol=0.0001,
                          validation_fraction=0.1, verbose=0, warm_start=True) 
[1] random_holdout_set_from_training_data's score is: -0.244
[2] random_holdout_set_from_training_data's score is: -0.239
[3] random_holdout_set_from_training_data's score is: -0.239
[4] random_holdout_set_from_training_data's score is: -0.234
[5] random_holdout_set_from_training_data's score is: -0.237
[6] random_holdout_set_from_training_data's score is: -0.238
[7] random_holdout_set_from_training_data's score is: -0.237
[8] random_holdout_set_from_training_data's score is: -0.235
[9] random_holdout_set_from_training_data's score is: -0.233
[10] random_holdout_set_from_training_data's score is: -0.232
[11] random_holdout_set_from_training_data's score is: -0.232
[12] random_holdout_set_from_training_data's score is: -0.235
[13] random_holdout_set_from_training_data's score is: -0.236
[14] random_holdout_set_from_training_data's score is: -0.235
[15] random_holdout_set_from_training_data's score is: -0.239
[16] random_holdout_set_from_training_data's score is: -0.239
[17] random_holdout_set_from_training_data's score is: -0.239
[18] random_holdout_set_from_training_data's score is: -0.24
[19] random_holdout_set_from_training_data's score is: -0.241
[20] random_holdout_set_from_training_data's score is: -0.241
[21] random_holdout_set_from_training_data's score is: -0.241
[22] random_holdout_set_from_training_data's score is: -0.242
[23] random_holdout_set_from_training_data's score is: -0.241
[24] random_holdout_set_from_training_data's score is: -0.245
[25] random_holdout_set_from_training_data's score is: -0.245
[26] random_holdout_set_from_training_data's score is: -0.244
[27] random_holdout_set_from_training_data's score is: -0.245
[28] random_holdout_set_from_training_data's score is: -0.246
[29] random_holdout_set_from_training_data's score is: -0.245
[30] random_holdout_set_from_training_data's score is: -0.253
[31] random_holdout_set_from_training_data's score is: -0.254
The number of estimators that were the best for this training dataset: 11
The best score on the holdout set: -0.231888160976642
[CV]  _scorer=<auto_ml.utils_scoring.RegressionScorer object at 0x7f8174e77780>, model=GradientBoostingRegressor(alpha=0.9, criterion='friedman_mse', init=None,
                          learning_rate=0.1, loss='ls', max_depth=3,
                          max_features=None, max_leaf_nodes=None,
                          min_impurity_decrease=0.0, min_impurity_split=None,
                          min_samples_leaf=1, min_samples_split=2,
                          min_weight_fraction_leaf=0.0, n_estimators=100,
                          n_iter_no_change=None, presort=False,
                          random_state=None, subsample=1.0, tol=0.0001,
                          validation_fraction=0.1, verbose=0, warm_start=True), score=-0.256, total=   0.1s
[CV] _scorer=<auto_ml.utils_scoring.RegressionScorer object at 0x7f8174e77780>, model=GradientBoostingRegressor(alpha=0.9, criterion='friedman_mse', init=None,
                          learning_rate=0.1, loss='ls', max_depth=3,
                          max_features=None, max_leaf_nodes=None,
                          min_impurity_decrease=0.0, min_impurity_split=None,
                          min_samples_leaf=1, min_samples_split=2,
                          min_weight_fraction_leaf=0.0, n_estimators=31,
                          n_iter_no_change=None, presort=False,
                          random_state=None, subsample=1.0, tol=0.0001,
                          validation_fraction=0.1, verbose=0, warm_start=True) 
[1] random_holdout_set_from_training_data's score is: -0.229
[2] random_holdout_set_from_training_data's score is: -0.225
[3] random_holdout_set_from_training_data's score is: -0.216
[4] random_holdout_set_from_training_data's score is: -0.215
[5] random_holdout_set_from_training_data's score is: -0.211
[6] random_holdout_set_from_training_data's score is: -0.207
[7] random_holdout_set_from_training_data's score is: -0.206
[8] random_holdout_set_from_training_data's score is: -0.208
[9] random_holdout_set_from_training_data's score is: -0.201
[10] random_holdout_set_from_training_data's score is: -0.2
[11] random_holdout_set_from_training_data's score is: -0.2
[12] random_holdout_set_from_training_data's score is: -0.202
[13] random_holdout_set_from_training_data's score is: -0.197
[14] random_holdout_set_from_training_data's score is: -0.196
[15] random_holdout_set_from_training_data's score is: -0.197
[16] random_holdout_set_from_training_data's score is: -0.198
[17] random_holdout_set_from_training_data's score is: -0.198
[18] random_holdout_set_from_training_data's score is: -0.2
[19] random_holdout_set_from_training_data's score is: -0.201
[20] random_holdout_set_from_training_data's score is: -0.2
[21] random_holdout_set_from_training_data's score is: -0.205
[22] random_holdout_set_from_training_data's score is: -0.206
[23] random_holdout_set_from_training_data's score is: -0.206
[24] random_holdout_set_from_training_data's score is: -0.208
[25] random_holdout_set_from_training_data's score is: -0.205
[26] random_holdout_set_from_training_data's score is: -0.205
[27] random_holdout_set_from_training_data's score is: -0.202
[28] random_holdout_set_from_training_data's score is: -0.202
[29] random_holdout_set_from_training_data's score is: -0.202
[30] random_holdout_set_from_training_data's score is: -0.204
[31] random_holdout_set_from_training_data's score is: -0.205
[32] random_holdout_set_from_training_data's score is: -0.205
[33] random_holdout_set_from_training_data's score is: -0.205
[34] random_holdout_set_from_training_data's score is: -0.206
The number of estimators that were the best for this training dataset: 14
The best score on the holdout set: -0.19568170577598212
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.1s remaining:    0.0s
[CV]  _scorer=<auto_ml.utils_scoring.RegressionScorer object at 0x7f8174e77780>, model=GradientBoostingRegressor(alpha=0.9, criterion='friedman_mse', init=None,
                          learning_rate=0.1, loss='ls', max_depth=3,
                          max_features=None, max_leaf_nodes=None,
                          min_impurity_decrease=0.0, min_impurity_split=None,
                          min_samples_leaf=1, min_samples_split=2,
                          min_weight_fraction_leaf=0.0, n_estimators=31,
                          n_iter_no_change=None, presort=False,
                          random_state=None, subsample=1.0, tol=0.0001,
                          validation_fraction=0.1, verbose=0, warm_start=True), score=-0.248, total=   0.1s
[CV] _scorer=<auto_ml.utils_scoring.RegressionScorer object at 0x7f8174e77780>, model=RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=None,
                      max_features='auto', max_leaf_nodes=None,
                      min_impurity_decrease=0.0, min_impurity_split=None,
                      min_samples_leaf=1, min_samples_split=2,
                      min_weight_fraction_leaf=0.0, n_estimators=30, n_jobs=1,
                      oob_score=False, random_state=None, verbose=0,
                      warm_start=False) 
[CV]  _scorer=<auto_ml.utils_scoring.RegressionScorer object at 0x7f8174e77780>, model=RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=None,
                      max_features='auto', max_leaf_nodes=None,
                      min_impurity_decrease=0.0, min_impurity_split=None,
                      min_samples_leaf=1, min_samples_split=2,
                      min_weight_fraction_leaf=0.0, n_estimators=30, n_jobs=1,
                      oob_score=False, random_state=None, verbose=0,
                      warm_start=False), score=-0.236, total=   0.1s
[CV] _scorer=<auto_ml.utils_scoring.RegressionScorer object at 0x7f8174e77780>, model=RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=None,
                      max_features='auto', max_leaf_nodes=None,
                      min_impurity_decrease=0.0, min_impurity_split=None,
                      min_samples_leaf=1, min_samples_split=2,
                      min_weight_fraction_leaf=0.0, n_estimators=30, n_jobs=1,
                      oob_score=False, random_state=None, verbose=0,
                      warm_start=False) 
[CV]  _scorer=<auto_ml.utils_scoring.RegressionScorer object at 0x7f8174e77780>, model=RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=None,
                      max_features='auto', max_leaf_nodes=None,
                      min_impurity_decrease=0.0, min_impurity_split=None,
                      min_samples_leaf=1, min_samples_split=2,
                      min_weight_fraction_leaf=0.0, n_estimators=30, n_jobs=1,
                      oob_score=False, random_state=None, verbose=0,
                      warm_start=False), score=-0.262, total=   0.1s
[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.2s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.3s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:    0.4s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:    0.4s finished
The best CV score from our hyperparameter search (by default averaging across k-fold CV) for CHAS is:
-0.24878969905471796
The best params were
{'model': 'RandomForestRegressor'}
Here are all the hyperparameters that were tried:
Score in the following columns always refers to cross-validation score
+--------------+
|   mean_score |
|--------------|
|      -0.2519 |
|      -0.2488 |
+--------------+
Calculating feature responses, for advanced analytics.
Here are our feature responses for the trained model
+----+----------------+---------+-------------------+-------------------+-----------+-----------+
|    | Feature Name   |   Delta |   FR_Decrementing |   FR_Incrementing |   FRD_MAD |   FRI_MAD |
|----+----------------+---------+-------------------+-------------------+-----------+-----------|
| 12 | AGE            | 13.9801 |            0.0050 |           -0.0002 |    0.0000 |    0.0000 |
| 11 | LSTAT          |  3.5508 |            0.0263 |            0.0011 |    0.0000 |    0.0000 |
| 10 | ZN             | 11.5619 |           -0.0005 |            0.0012 |    0.0000 |    0.0000 |
|  9 | DIS            |  1.0643 |            0.1068 |            0.0017 |    0.0000 |    0.0000 |
|  8 | B              | 45.7266 |           -0.0071 |            0.0026 |    0.0000 |    0.0000 |
|  7 | RM             |  0.3543 |            0.0022 |            0.0073 |    0.0000 |    0.0000 |
|  6 | TAX            | 82.9834 |            0.0111 |           -0.0075 |    0.0000 |    0.0000 |
|  5 | PTRATIO        |  1.1130 |            0.0111 |           -0.0081 |    0.0000 |    0.0000 |
|  4 | MEDV           |  4.6603 |           -0.0050 |            0.0104 |    0.0000 |    0.0000 |
|  3 | CRIM           |  4.4320 |            0.0123 |            0.0221 |    0.0000 |    0.0000 |
|  2 | INDUS          |  3.4430 |            0.0060 |            0.0229 |    0.0000 |    0.0000 |
|  1 | RAD            |  4.2895 |           -0.0021 |            0.0350 |    0.0000 |    0.0000 |
|  0 | NOX            |  0.0588 |           -0.0099 |            0.0545 |    0.0000 |    0.0000 |
+----+----------------+---------+-------------------+-------------------+-----------+-----------+
<auto_ml.predictor.Predictor at 0x7f81a83eb320>

@ghk829
Author

ghk829 commented Nov 18, 2019

@loaiabdalslam
Thank you for checking the issue.
What parameters did you set for the run? The same as mine?

@loaiabdalslam

Yeah, just create a new virtual env for Python and try installing everything in it.
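A sketch of that advice using the standard-library venv module; the environment path is an assumption, and the requirements file name comes from the earlier comment:

```python
import venv
from pathlib import Path

# Create a fresh, isolated environment (path "auto_ml-env" is arbitrary).
env_dir = Path("auto_ml-env")
venv.EnvBuilder(with_pip=True).create(env_dir)
print("created", env_dir)

# Then, from a shell, install into it:
#   auto_ml-env/bin/pip install -r advanced_requirements.txt
```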

@Aun0124

Aun0124 commented Apr 28, 2021

In the latest version and documentation, how do I pass a list of models and let it choose the best one?
I didn't get a result recommending the best model for me.
