This is a Cookiecutter template for creating a Orion Algorithm Plugin.
Install Python requirements to use the template:
$ pip install -r requirements.txt
Create a new project directly from the template on GitHub:
$ cookiecutter gh:Epistimio/cookiecutter-orion.algo
plugin_name []: skopt
author_name []: Xavier Bouthillier
author_short [Author Name]:
author_email []: xavier.bouthillier@umontreal.ca
github_username []: bouthilx
copyright [2021, Author Name]:
short_description [TODO]:
algo_name []: BayesianOptimizer
algo_module_name [bayesianoptimizer]: bayes
Field | Description | |
plugin_name |
Will be used for orion.algo.plugin_name | |
author_name |
For metadata of python package | |
author_short |
For metadata of python package | |
author_email |
For metadata of python package | |
github_username |
Username to build the url for installation | |
copyright |
For the BSD-3 license (You can change the license) | |
short_description |
For metadata of python package | |
algo_name |
Name for the algorithm class | |
algo_module_name |
Name of the algorihtm module |
This will create the following package structure.
orion.algo.{plugin_name}
├── README.rst
├── setup.cfg
├── setup.py
├── MANIFEST.in
├── LICENSE (BSD License)
├── versioneer.py
├── tox.ini
├── dev-requirements.txt
├── doc
│ ├── requirements.txt
│ └── src
│ ├── conf.py
│ └── index.rst
├── tests
│ ├── requirements.txt
│ ├── integration_test.py
│ └── benchmark.py
└── src
└── orion
└── algo
└── {plugin_name}
├── {algoname}.py
├── __init__.py
└── _version.py
The important files to modify are src/orion/algo/{plugin_name}/{module_name}.py
to implement the algorithm and tests/benchmark/{algo_name}.yaml
to fill the arguments required for the algorithm you implement.
LICENSE
Note that you are free to change the License, copyright is to your name.
versioneer.py
src/orion/algo/{plugin_name}/_version.py
This serves to version automatically your algo, just ignore these if you don't plan to make releases.
tests
These are the automatically generated tests.
tests/benchmark
Automatically generated benchmark test using the yaml files created in the same folder.
doc
Automatically generated template for documentation
tox.ini
Tox file defining commands to run tests, build doc and publish code.
Tip |
The cookiecutter creates a fully functionning random search plugin. All tests should pass right away. Make sure your environment is setup properly and that all tests are passing as expected before making any changes in the code. |
class YourAlgorithm(BaseAlgorithm):
requires_type = "real"
requires_dist = "linear"
requires_shape = "flattened"
Some algorithms have special requirements on the data. Maybe the search space should be real only, logarithmic distributions should be linearized, or multidimensional data should be flattened. The algorithm class can specify these special requirements through the class attributes requires_type
, requires_dist
and requires_shape
. See documentation of build_required_space to see the special requirements supported.
When Oríon will instantiate an algorithm, it will wrap it in a SpaceTransformAlgoWrapper and create a transformed space based on the requirements. The SpaceTransformAlgoWrapper
will take care of converting the hyperparameters so that they are compliant with the search space outsite the algorithm and compliant with the algorithm's requirements when passed to the algorithm.
Suppose the algorithm has requires_type = "real"
, then Oríon will assign a transformed space to the algorithm in which all dimensions are transformed to be real. For instance, categorical dimensions are turned to one-hot vectors. The algorithm can then work in this real space solely. When the algorithm suggest trials, the SpaceTransformAlgoWrapper will convert the hyperparameters of these trials so that they are valid in the original space. For instance, one-hot vectors are converted back to categorical values.
@property
def space(self):
"""Domain of problem associated with this algorithm's instance."""
return self._space
@space.setter
def space(self, space):
"""Set space."""
self._space = space
At creation time, the algorithm is passed the original space as argument space
to its constructor __init__
. The algorithm is then assigned the transformed space through the setter algorithm.space = transformed_space
. This is the right place to keep a reference to the original space if you need to. See skopt's implementation for an example. Keeping a reference to the original space can be useful for instance to verify if an algorithm has tried every possible values in the original space. See is_done()
implementation of skopt's implementation for an example using the original space.
Tests that should pass at this point | Description |
test_get_id |
Test that id of trial is computed properly. |
test_configuration |
Test that configuration contains all expected attributes. |
def __init__(self, space, seed=None, some='other', custom='arguments'):
super(YourAlgorithm, self).__init__(space, seed=seed, some=some, custom=custom)
The initialization of the algorithm must pass space and seed to super().__init__
, but must also pass any other argument that should be part of the configuration of the algorithm. Any argument passed to super()
will be assigned as an attribute to the algorithm and will be included in algo.configuration
, which is used to save the configuration of the algorithm in storage.
def suggest(self, num):
The method to suggest new trials. The argument num
request the number of trials that the algorithm must sample. The algorithm may return less than num
if it is impossible to return more, and even return an empty list when it is not possible to suggest a new trial. This may happen for example if there are too many trials still being executed.
The algorithm should verify with self.has_suggested(trial)
if a trial was already suggested to make sure trials returned by suggest()
are new ones. Before returning a new trial, make sure to register it with self.register(trial)
so that self.has_suggested(trial)
works properly.
Tip |
Utility functions in format_trials are convenient to convert sets of hyperparameters from dictionary format or list of values to Trial objects. |
Tests that should pass at this point | Description |
test_suggest_n |
Test that the algorithm is able to suggest n trials. |
test_has_suggested |
Test that trials suggested are properly registered |
test_n_suggested |
Test that trials suggested are properly registered and counted. |
def observe(self, trials):
The method to observe new status or results of trials. Observe may be called multiple time for the same trials until they are completed or broken. See Trial's documentation for more information on their status or results attributes. An observed trial should be registered using self.register(trial)
so that self.has_observed(trial)
can work properly. Note that has_observed(trial)
may return true only if a trial is completed. It will return False if a trial is broken.
All trials registered with self.register(trial)
are stored in a dictionary self._trials_info
in the format {self.get_id(trial): (trial, trial.objective.value if trial.objective else None)}
.
Warning |
|
Tests that should pass at this point | Description |
test_observe |
Test that algorithm can observe without any expection. The validity of observe is not verified here. |
test_has_observed |
Test that observed completed trials are registered properly. |
test_n_observed |
Test that observed completed trials are registered and counted properly. |
test_real_data |
Test that algorithm can work with float values. |
test_int_data |
Test that algorithm can suggest and observe discrete values |
test_cat_data |
Test that algorithm can suggest and observe categorical values |
test_logint_data |
Test that algorithm can work with discrete loguniform dimensions. |
test_logreal_data |
Test that algorithm can work with real loguniform dimensions. |
test_shape_data |
Test that algorithm can work with arrays. |
def seed_rng(self, seed=None):
This method must seed the internal state of the algorithm so that it would always sample the same sequence of points.
You may need to seed global random number generators such as random
or numpy.random
if you are wrapping a third party library using them.
Tests that should pass at this point | Description |
test_seed_rng |
Test that seeding the algorithm can be seeded properly. |
test_seed_rng_init |
Test that the algorithm is seeded during instantiation. |
@property
def state_dict(self):
The state dict is used to store the algorithm's state in storage so that it can be shared across runners (the main processes running the optimization). Make sure to include any attribute that may change during the execution of the algorithm, including the state of random number generator used. Return copies of these attributes to avoid any side-effects. There is no need to include the values of the algorithm's configuration.
def set_state(self, state_dict):
Stateful attributes of the algorithm are reset using the given state_dict
. Note that set_state
must be compliant with state_dict
and use the same structure.
Tests that should pass at this point | Description |
test_state_dict |
Test that state_dict can be used to resume algorithm and sample the same trials deterministically. |
test_has_observed_statedict |
Test that state_dict sets back observed trials. |
test_has_suggested_statedict |
Test that state_dict sets back suggested trials. |
def is_done(self, points, results):
This method is used to signal whether the algorithm is done or not. The base implementation verifies if all possible values have been tried or whether more than self.max_trials
trials have been tried. self.max_trials
is an attribute set by Oríon based on the experiments max_trials
value, you should not add it to your algorithm's configuration.
You will need to overwrite this method if your algorithm use a transformed space, because transformed space can have a larger domain than the original one (ex: working with real values instead integers). The original space is the one that matters for the experiment and it should thus be the space used to infer whether all possible values have been tried. See skopt's implementation for an example of a custom is_done
implementation to handle transformed space.
Tests that should pass at this point | Description |
test_is_done_cardinality |
Test that algorithms is done when all possible values has been tried. |
test_is_done_max_trials |
Test that algorithm is done when maximum number of trials have been tried. |
test_optimize_branin |
Test that algorithm reaches a reasonable objective when optimizing branin. |
The base algorithm class implements several other methods that rarely requires to be overwritten but can prove very useful. Among In particular, you should pay attention to get_id(trial), fidelity_index, register(trial), has_suggested(trial), and has_observed(trial).
To test the freshly built package, you must first install the requirements. From within the new package, run
$ pip install -r dev-requirements.txt
$ pip install -r tests/requirements.txt
You can then run the unit-tests with
$ pytest tests/integration_test.py
or using tox
,
$ tox -e py36
Note that the algorithm pre-built is random search so that you can start from a fully working environment and test your way through the modifications.
There is also the option of running the toy-benchmark to compare the performance of your algorithm with random search and TPE.
$ python tests/benchmark.py
The benchmark will generate interactive plotly figures saved as Branin_AverageResult.html
and RosenBrock_AverageResult.html
.
Finally, official plugins must follow the same code quality standards than orion.core
. Therefore there are tests included in the pre-built package for black
, isort
and pylint
. You can execute them with
$ tox -e black
To fix black
or isort
issues, you can use the tox commands run-black
or run-isort
. For example:
$ tox -e run-black