Can't specify list-valued hyperparameters in AutoMLSearch #2015

freddyaboulton · 2021-03-23T14:30:09Z

Repro:

from evalml.demos import load_breast_cancer
from evalml.pipelines import BinaryClassificationPipeline
from evalml.automl import AutoMLSearch

class PipeLine(BinaryClassificationPipeline):
    component_graph = ["Drop Columns Transformer", "Random Forest Classifier"]
    
X, y = load_breast_cancer()

automl = AutoMLSearch(X, y, problem_type="binary", allowed_pipelines=[PipeLine],
                      pipeline_parameters={"Drop Columns Transformer": {"columns": ["mean texture"]}})
automl.search()

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/sources/evalml/evalml/pipelines/component_graph.py in instantiate(self, parameters)
     77             try:
---> 78                 new_component = component_class(**component_parameters, random_seed=self.random_seed)
     79             except (ValueError, TypeError) as e:

~/sources/evalml/evalml/pipelines/components/transformers/column_selectors.py in __init__(self, columns, random_seed, **kwargs)
     15         if columns and not isinstance(columns, list):
---> 16             raise ValueError(f"Parameter columns must be a list. Received {type(columns)}.")
     17 

ValueError: Parameter columns must be a list. Received <class 'str'>.

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<ipython-input-21-b4819258a317> in <module>
     10 automl = AutoMLSearch(X, y, problem_type="binary", allowed_pipelines=[PipeLine],
     11                       pipeline_parameters={"Drop Columns Transformer": {"columns": ["mean texture"]}})
---> 12 automl.search()

~/sources/evalml/evalml/automl/automl_search.py in search(self, show_iteration_plot)
    490         logger.info("Allowed model families: %s\n" % ", ".join([model.value for model in self.allowed_model_families]))
    491         self.search_iteration_plot = None
--> 492         if self.plot:
    493             self.search_iteration_plot = self.plot.search_iteration_plot(interactive_plot=show_iteration_plot)
    494 

~/sources/evalml/evalml/automl/automl_algorithm/iterative_algorithm.py in next_batch(self)
     63         next_batch = []
     64         if self._batch_number == 0:
---> 65             next_batch = [pipeline_class(parameters=self._transform_parameters(pipeline_class, {}), random_seed=self.random_seed)
     66                           for pipeline_class in self.allowed_pipelines]
     67 

~/sources/evalml/evalml/automl/automl_algorithm/iterative_algorithm.py in <listcomp>(.0)
     63         next_batch = []
     64         if self._batch_number == 0:
---> 65             next_batch = [pipeline_class(parameters=self._transform_parameters(pipeline_class, {}), random_seed=self.random_seed)
     66                           for pipeline_class in self.allowed_pipelines]
     67 

~/sources/evalml/evalml/pipelines/classification_pipeline.py in __init__(self, parameters, random_seed)
     23         """
     24         self._encoder = LabelEncoder()
---> 25         super().__init__(parameters, random_seed=random_seed)
     26 
     27     def fit(self, X, y):

~/sources/evalml/evalml/pipelines/pipeline_base.py in __init__(self, parameters, random_seed)
     77         else:
     78             self._component_graph = ComponentGraph(component_dict=self.component_graph, random_seed=self.random_seed)
---> 79         self._component_graph.instantiate(parameters)
     80 
     81         self.input_feature_names = {}

~/sources/evalml/evalml/pipelines/component_graph.py in instantiate(self, parameters)
     80                 self._is_instantiated = False
     81                 err = "Error received when instantiating component {} with the following arguments {}".format(component_name, component_parameters)
---> 82                 raise ValueError(err) from e
     83 
     84             component_instances[component_name] = new_component

ValueError: Error received when instantiating component Drop Columns Transformer with the following arguments {'columns': 'mean texture'}

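The IterativeAlgorithm selects the first element of the columns list, so the Drop Columns Transformer receives the string 'mean texture' instead of the list ['mean texture']. This is not the intended behavior.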


angela97lin · 2021-03-23T17:38:53Z

This issue arises when IterativeAlgorithm calls _transform_parameters and tries to unpack the parameters. That code was added to handle the case where the user passes pipeline_parameters to freeze hyperparameters or to restrict them to a particular subset. For example:

    params = {'Imputer': {'numeric_impute_strategy': ['median', 'most_frequent']},
              'Decision Tree Regressor': {'max_depth': [17, 18, 19], 'max_features': Categorical(['auto'])},
              'Elastic Net Regressor': {"alpha": Real(0, 0.5), "l1_ratio": (0.01, 0.02, 0.03)}}
    automl = AutoMLSearch(X_train=X, y_train=y, problem_type='regression', pipeline_parameters=params, n_jobs=1)
    automl.search()

In the first batch, _transform_parameters handles list inputs such as max_depth or numeric_impute_strategy above by simply choosing the first element in the list.
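To make the failure mode concrete, here is a minimal, self-contained sketch of the behavior described above. It is not evalml's actual implementation: `first_batch_value` and `transform_parameters` are hypothetical names, and the sketch assumes only that any list-valued entry is read as a range of candidates and collapsed to its first element.

```python
# Hypothetical sketch of the first-batch unpacking; not evalml's real code.

def first_batch_value(value):
    """Collapse a user-supplied hyperparameter to a single value for batch 0."""
    if isinstance(value, list):
        # A list is interpreted as a range of candidates, so the first one wins.
        return value[0]
    return value


def transform_parameters(pipeline_parameters):
    """Apply first_batch_value to every parameter of every component."""
    return {
        component: {name: first_batch_value(v) for name, v in params.items()}
        for component, params in pipeline_parameters.items()
    }


user_params = {
    "Drop Columns Transformer": {"columns": ["mean texture"]},
    "Decision Tree Regressor": {"max_depth": [17, 18, 19]},
}
first_batch = transform_parameters(user_params)
print(first_batch)
```

Under this assumption, max_depth becomes 17 as intended, but columns becomes the bare string 'mean texture', which matches the arguments reported in the traceback.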

One way around this issue is thus to remove that list handling and enforce that lists aren't allowed as hyperparameter ranges.
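A rough sketch of what that could look like, under stated assumptions: only an explicit range type is unpacked, while plain lists are passed through as literal values. `CategoricalRange` is a hypothetical stand-in for a search-space type such as Categorical, and the function names are illustrative rather than evalml's API.

```python
# Hypothetical sketch of the proposed behavior; not evalml's real code.
from dataclasses import dataclass
from typing import Any, List


@dataclass
class CategoricalRange:
    """Stand-in (assumption) for a dedicated search-space type."""
    choices: List[Any]


def first_batch_value(value):
    """Only explicit range objects are unpacked; lists stay literal."""
    if isinstance(value, CategoricalRange):
        return value.choices[0]
    return value


def transform_parameters(pipeline_parameters):
    """Apply first_batch_value to every parameter of every component."""
    return {
        component: {name: first_batch_value(v) for name, v in params.items()}
        for component, params in pipeline_parameters.items()
    }


user_params = {
    "Drop Columns Transformer": {"columns": ["mean texture"]},
    "Decision Tree Regressor": {"max_depth": CategoricalRange([17, 18, 19])},
}
first_batch = transform_parameters(user_params)
print(first_batch)
```

With this split, a list-valued parameter such as columns reaches the component unchanged, and a range of candidates has to be declared explicitly.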

@dsherry @freddyaboulton @bchen1116 @chukarsten FYI :)

jeremyliweishih · 2021-03-23T19:15:03Z

@dsherry @chukarsten

In #1862 the plan is to add a Drop Columns Transformer in _get_preprocessing_components when an index column exists, and then add those columns to self.pipeline_parameters as well, so this issue will block that too.

freddyaboulton added the bug label Mar 23, 2021

freddyaboulton mentioned this issue Mar 23, 2021

Update AutoMLSearch with actions parameter #1981

Closed

jeremyliweishih mentioned this issue Mar 23, 2021

Automatically drop features tagged as "primary key" #1862

Closed

angela97lin self-assigned this Mar 24, 2021

angela97lin mentioned this issue Mar 24, 2021

Remove lists as acceptable hyperparameter ranges in AutoMLSearch #2028

Merged

angela97lin closed this as completed in #2028 Mar 25, 2021
