Two new y-transformation approaches #611

mlindauer · 2020-03-04T14:36:12Z

bilog (log transformations above 0 and below 0)
Gaussian Copula (ECDF -> quantiles -> Inverse Gaussian CDF)
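
For readers unfamiliar with the first transformation, here is a minimal sketch. It assumes that "bilog" means sign(y) * log(1 + |y|); the description above only says "log transformations above 0 and below 0", so the exact form used in the branch may differ. The function names `bilog` and `inverse_bilog` are hypothetical and not taken from the PR.

```python
import numpy as np


def bilog(y: np.ndarray) -> np.ndarray:
    """Symmetric log transform (assumed form): sign(y) * log(1 + |y|)."""
    y = np.asarray(y, dtype=float)
    return np.sign(y) * np.log1p(np.abs(y))


def inverse_bilog(z: np.ndarray) -> np.ndarray:
    """Undo bilog: sign(z) * (exp(|z|) - 1)."""
    z = np.asarray(z, dtype=float)
    return np.sign(z) * np.expm1(np.abs(z))


y = np.array([-100.0, -1.0, 0.0, 1.0, 100.0])
z = bilog(y)
print(z)                 # compressed values, same signs as y
print(inverse_bilog(z))  # recovers y up to floating-point error
```

Under this assumed form the transform is odd (bilog(-y) == -bilog(y)) and maps 0 to 0, so it compresses large positive and large negative values equally.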

dengdifan · 2021-07-16T11:01:24Z

If everyone is happy with the implementation, I will merge this branch.

mfeurer

I'm not sure if we want to merge this PR at the moment:

We don't have a method that uses quantile transformations
I think the quantile transformation should be improved
We don't have a method that uses bilog transformations at the moment

mfeurer · 2021-07-16T12:54:49Z

smac/runhistory/runhistory2epm.py

+        np.ndarray
+        """
+        # ECDF
+        quants = [sp.stats.percentileofscore(values, v)/100 - VERY_SMALL_NUMBER for v in values]


I believe this is incorrect. I reimplemented this according to Salinas et al., which appears to give better, and most importantly, symmetric outputs:

    import numpy as np
    import scipy.stats

    values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    VERY_SMALL_NUMBER = 1e-10

    # This PR
    quants = [scipy.stats.percentileofscore(values, v)/100 - VERY_SMALL_NUMBER for v in values]
    output = np.array([scipy.stats.norm.ppf(q) for q in quants]).reshape((-1, 1))
    print(output)

    # Correct
    quants = (scipy.stats.rankdata(values.flatten()) - 1) / (len(values) - 1)
    cutoff = 1 / (4 * np.power(len(values), 0.25) * np.sqrt(np.pi * np.log(len(values))))
    quants = np.clip(quants, a_min=cutoff, a_max=1 - cutoff)
    # Inverse Gaussian CDF
    rval = np.array([scipy.stats.norm.ppf(q) for q in quants]).reshape((-1, 1))
    print(rval)

output:

    [-1.28155157e+00 -8.41621234e-01 -5.24400513e-01 -2.53347103e-01 -2.50662848e-10 2.53347103e-01 5.24400512e-01 8.41621233e-01 1.28155156e+00 6.36134089e+00]
    [-1.62322583 -1.22064035 -0.76470967 -0.4307273 -0.1397103 0.1397103 0.4307273 0.76470967 1.22064035 1.62322583]
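
The reimplementation in this comment could be wrapped into a reusable function roughly as follows. This is only a sketch: the name `gaussian_copula_transform` is hypothetical, the function assumes at least two values, and it calls `scipy.stats.norm.ppf` on the whole array instead of looping. The rank-based quantiles and the cutoff are the same expressions as in the comment.

```python
import numpy as np
import scipy.stats


def gaussian_copula_transform(values: np.ndarray) -> np.ndarray:
    """Map values to standard-normal scores via clipped empirical quantiles.

    Assumes len(values) >= 2 (otherwise the rank scaling divides by zero).
    """
    values = np.asarray(values, dtype=float).flatten()
    n = len(values)
    # Empirical quantiles in [0, 1]; ties receive their average rank.
    quants = (scipy.stats.rankdata(values) - 1) / (n - 1)
    # Cutoff that keeps the quantiles away from 0 and 1, where ppf is infinite.
    cutoff = 1 / (4 * np.power(n, 0.25) * np.sqrt(np.pi * np.log(n)))
    quants = np.clip(quants, a_min=cutoff, a_max=1 - cutoff)
    # Inverse Gaussian CDF, applied to the whole array at once.
    return scipy.stats.norm.ppf(quants).reshape((-1, 1))


scores = gaussian_copula_transform(np.arange(1, 11))
# Symmetry check: the k-th smallest and k-th largest scores should cancel out.
print(np.allclose(scores, -scores[::-1]))
```

The symmetry check makes the difference between the two variants explicit: in the PR version the smallest value maps to about -1.28 while the largest maps to about 6.36, whereas the rank-based version clips both ends at the same distance from the center.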

alexandertornede · 2023-01-26T12:24:52Z

We will have a look at how these methods perform once we have the new benchmarking fully in place.

mfeurer · 2023-01-27T08:43:59Z

The recent HEBO work suggests using the PowerTransformer from scikit-learn. If you plan to benchmark these two transformations, could you also add this one to the mix?
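
For reference, a minimal sketch of what such a transformation could look like with scikit-learn's `PowerTransformer`. The data below is synthetic, and the choice of `method="yeo-johnson"` is an assumption: this thread does not say which power transform HEBO applies or how it would be wired into SMAC.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Hypothetical, skewed cost values standing in for observed y values.
rng = np.random.default_rng(0)
y = rng.lognormal(mean=0.0, sigma=1.0, size=200).reshape(-1, 1)

# Yeo-Johnson also handles zero and negative values; Box-Cox needs y > 0.
transformer = PowerTransformer(method="yeo-johnson", standardize=True)
y_transformed = transformer.fit_transform(y)

# With standardize=True the output has roughly zero mean and unit variance.
print(y_transformed.mean(), y_transformed.std())

# The fitted transformer can map values back to the original scale.
y_back = transformer.inverse_transform(y_transformed)
```

Unlike the Gaussian Copula transformation, this is a parametric transform with a single fitted exponent per column, so it preserves more of the relative spacing between observations.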

alexandertornede · 2023-02-01T12:30:51Z

Thanks for the pointer. Sure!

mlindauer added 2 commits March 4, 2020 15:31

ADD bilog and GaussianCopula for y-transformation

3e9b682

MAINT meta header

3fe27f8

mlindauer requested a review from KEggensperger March 4, 2020 14:36

dengdifan requested a review from mfeurer July 16, 2021 11:00

mfeurer requested changes Jul 16, 2021


stale bot added the stale label Jun 17, 2022

renesass added feature and removed stale labels Jun 23, 2022

automl deleted a comment from stale bot Jun 23, 2022

alexandertornede self-assigned this Jan 26, 2023

benjamc added this to the v2.1 milestone Feb 16, 2023

alexandertornede mentioned this pull request Apr 11, 2023

Implement y-transformation approaches #966

Open
