Significance Analysis

This package analyses datasets in which different HPO (hyperparameter optimization) algorithms are evaluated on multiple benchmarks.

Note

As indicated by the v0.x.x version number, Significance Analysis is early-stage code and its APIs might change in the future.

Documentation

Please have a look at our example. The dataset should have the following format:

system_id (algorithm name)	input_id (benchmark name)	metric (mean/estimate)	optional: bin_id (budget/training round)
Algorithm1	Benchmark1	x.xxx	1
Algorithm1	Benchmark1	x.xxx	2
Algorithm1	Benchmark2	x.xxx	1
...	...	...	...
Algorithm2	Benchmark2	x.xxx	2

This dataset contains two algorithms, each run on two benchmarks for two iterations. The variable names (system_id, input_id, ...) can be customized, but they have to be consistent throughout the dataset, i.e. not "mean" for one benchmark and "estimate" for another. The conduct_analysis function is then called with the dataset and the variable names as parameters.

Optionally, the dataset can be binned according to a fourth variable (bin_id), in which case the analysis is conducted on each bin separately, as shown in our example. To do this, provide the name of the bin_id variable and, if desired, the exact bins and bin labels. Otherwise, a bin is created for each unique value.
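As a minimal sketch of the format described above, the following builds such a dataset with pandas. The metric values are made-up placeholders, and the `budget_bin` column and the "low"/"high" labels are hypothetical names chosen for illustration. The binning is done with `pandas.cut` only to show what explicit bins and labels mean; it is not necessarily how the package bins internally.

```python
import pandas as pd

# Placeholder rows (values are invented for illustration only):
# two algorithms x two benchmarks x two training rounds.
rows = [
    ("Algorithm1", "Benchmark1", 0.812, 1),
    ("Algorithm1", "Benchmark1", 0.845, 2),
    ("Algorithm1", "Benchmark2", 0.640, 1),
    ("Algorithm1", "Benchmark2", 0.671, 2),
    ("Algorithm2", "Benchmark1", 0.798, 1),
    ("Algorithm2", "Benchmark1", 0.851, 2),
    ("Algorithm2", "Benchmark2", 0.655, 1),
    ("Algorithm2", "Benchmark2", 0.702, 2),
]

# One row per (algorithm, benchmark, round); column names must stay
# consistent across the whole dataset.
data = pd.DataFrame(rows, columns=["system_id", "input_id", "metric", "bin_id"])

# Default behaviour described in the text: one bin per unique bin_id value.
default_bins = [int(b) for b in sorted(data["bin_id"].unique())]

# Explicit bins with labels, sketched with pandas.cut.
# Intervals are (0, 1] -> "low" and (1, 2] -> "high".
data["budget_bin"] = pd.cut(data["bin_id"], bins=[0, 1, 2], labels=["low", "high"])
```

A dataset stored this way can be written with `data.to_csv(...)` and read back with `pd.read_csv(...)`, as in the usage pattern below.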

Installation

Using R (>= 4.0.0), install the packages Matrix, emmeans, lmerTest and lme4.

Using pip

pip install significance-analysis

Usage

1. Generate data from HPO algorithms on benchmarks and save it in the format described above.
2. Call the conduct_analysis function on the dataset, specifying the variable names.

In code, the usage pattern can look like this:

import pandas as pd
from significance_analysis import conduct_analysis

# 1. Generate/import dataset
data = pd.read_csv("./significance_analysis_example/exampleDataset.csv")

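# 2. Analyse dataset, passing the column names used in the dataset:
#    "mean" (metric), "acquisition" (algorithm), "benchmark" (benchmark)
conduct_analysis(data, "mean", "acquisition", "benchmark")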

For more details and features please have a look at our example.
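
Because the variable names are passed as plain strings, it can help to check the dataset before running the analysis. The sketch below is one possible pre-flight check. `check_dataset` is a hypothetical helper and not part of the package, and the toy values are placeholders.

```python
import pandas as pd


def check_dataset(data, metric, system_id, input_id, bin_id=None):
    """Hypothetical helper (not part of significance_analysis).

    Verifies that the named columns exist and that the metric is numeric,
    then returns the number of rows per (algorithm, benchmark) pair.
    """
    required = [metric, system_id, input_id]
    if bin_id is not None:
        required.append(bin_id)

    missing = [col for col in required if col not in data.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")

    if not pd.api.types.is_numeric_dtype(data[metric]):
        raise TypeError(f"Column {metric!r} must be numeric")

    return data.groupby([system_id, input_id]).size()


# Toy data with placeholder values, using the column names from the usage pattern.
toy = pd.DataFrame(
    {
        "acquisition": ["A", "A", "B", "B"],
        "benchmark": ["X", "X", "X", "X"],
        "mean": [0.1, 0.2, 0.3, 0.4],
    }
)
summary = check_dataset(toy, "mean", "acquisition", "benchmark")
```

Here `summary` is a pandas Series indexed by (algorithm, benchmark), which makes uneven numbers of runs easy to spot.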
