Improve closure by returning more reasonable values #130

Open · pricebenjamin opened this issue Feb 9, 2019 · 10 comments

@pricebenjamin (Contributor)

From the introduction:

In gplearn, several protected functions are used:

  • division, if the denominator lies between -0.001 and 0.001, returns 1.0.
  • square root returns the square root of the absolute value of the argument.
  • log returns the logarithm of the absolute value of the argument, or for very small values less than 0.001, it returns 0.0.
  • inverse, if the argument lies between -0.001 and 0.001, returns 0.0.
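
For concreteness, the quoted behavior corresponds roughly to the numpy sketch below. This is an illustration of the description above, not necessarily gplearn's exact source.

import numpy as np

# Sketch of the quoted behavior; may differ from gplearn's source.
def protected_division(x1, x2):
    # return 1.0 wherever the denominator lies in (-0.001, 0.001)
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x2) > 0.001, np.divide(x1, x2), 1.0)

def protected_sqrt(x):
    # square root of the absolute value of the argument
    return np.sqrt(np.abs(x))

def protected_log(x):
    # log of the absolute value; 0.0 for arguments in (-0.001, 0.001)
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x) > 0.001, np.log(np.abs(x)), 0.0)

def protected_inverse(x):
    # return 0.0 wherever the argument lies in (-0.001, 0.001)
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x) > 0.001, 1.0 / x, 0.0)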

Are these return values widely accepted?

If the denominator lies between -0.001 and 0.001, why should the node return 1.0? Alternatively,

  • Prune the node
  • If -0.001 < denominator < 0, set denominator equal to -0.001, and if 0 < denominator < 0.001, set denominator equal to 0.001
  • Construct a piece-wise function that assumes values of 1/x outside of the interval [-0.001, 0.001] and uses a line segment to join the two pieces (or use a more complicated, but differentiable, solution)

Similar approaches could be used for the logarithm. Rough sketches of the second and third options are below.
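
One possible reading of those two alternatives (illustrative only; the function names are made up here, and the 0.001 threshold is taken from the discussion above):

import numpy as np

EPS = 0.001  # threshold from the discussion above

def div_clamped(x1, x2):
    # option 2: push near-zero denominators out to +/-EPS,
    # keeping the sign of the original denominator
    x2 = np.where(np.abs(x2) < EPS, np.copysign(EPS, x2), x2)
    return x1 / x2

def inverse_piecewise(x):
    # option 3: behaves as 1/x outside [-EPS, EPS]; inside, a line
    # segment through (-EPS, -1/EPS) and (EPS, 1/EPS), i.e. x / EPS**2
    with np.errstate(divide='ignore'):
        return np.where(np.abs(x) > EPS, 1.0 / x, x / EPS**2)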

@pricebenjamin (Contributor, Author)

Or just add epsilon... 😄 Sometimes the easier solution doesn't immediately come to mind.
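
That might look something like this (a sketch; the signed-epsilon handling is one arbitrary choice among several):

import numpy as np

# illustrative helper, not part of gplearn
def div_eps(x1, x2, eps=1e-12):
    # nudge the denominator away from zero by a tiny signed epsilon
    return x1 / (x2 + np.copysign(eps, x2))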

@hwulfmeyer

0.001 is quite a high value for protected division.
Good work on noticing this issue.

@trevorstephens (Owner)

Are these return values widely accepted?

The value I used was a bit of a hand-wringer at the time. What's a small number? What's a big number? :-) In the end I chose a value that I felt wasn't going to explode the output values "too much" ... https://cswww.essex.ac.uk/staff/rpoli/gp-field-guide/B3SourceCode.html#24_3 used 0.001 for protected division, so I just adopted that. It felt reasonable at the time. I haven't been challenged on it until now, but I figured that Poli, Langdon & McPhee wouldn't be spreading anything unreasonable.

I'll carry on the discussion at #131

@danuker

danuker commented Oct 3, 2019

I can't help but think... why not make a closure at the fitness evaluation?

  1. Numpy is perfectly capable of handling infinities. Wouldn't an infinite cost work as "this example is really bad and you should never use it"? If an infinite penalty would remove too much information about an individual, one could clip it with a +/- "maximum penalty" instead of infinity.
>>> import numpy as np
>>> 1 / np.zeros(1)
array([inf])
>>> -1 / np.zeros(1)
array([-inf])
  2. NaNs when the target is not a NaN should similarly return an infinite/maximum cost. Why wouldn't this work?
>>> y_pred = np.array([float('-inf'),  float('nan'), float('inf')])
>>> y = np.array([1, 2, 3])
>>> y_pred - y
array([-inf,  nan,  inf])
>>> err = (y_pred - y)
>>> np.where(np.isnan(err), np.inf, err)
array([-inf,  inf,  inf])

And if you want to clip:

>>> MAX_PENALTY=1000
>>> np.clip(np.where(np.isnan(err), np.inf, err), -MAX_PENALTY, MAX_PENALTY)
array([-1000.,  1000.,  1000.])

Infinitely punishing NaNs is presented here: Genetic programming as a model induction engine

Another approach (which is adopted here) is to use a special undefined value NaN that is returned by functions with illegal inputs. All functions that receive NaN in any of their inputs will also return NaN. Finally, the objective function will return the worst possible value when one of the inputs is NaN.

But I couldn't find any scientific references to the "MAX_PENALTY" I came up with.

Handling closure at the fitness-function level would allow much greater flexibility in the supported building-block functions and eliminate the effort of ensuring closure for each of them. In addition, it would eliminate the need for an arbitrary "safety" threshold.

@smheidrich

@danuker Making nan yield infinite cost / worst possible fitness was my immediate thought as well, so I've been using custom fitness functions that do just that as a workaround:

import numpy as np
from gplearn.genetic import SymbolicRegressor
from gplearn.fitness import make_fitness

def _fitness(y, y_pred, w):
    # example for mean squared error; replace the line below as
    # appropriate for other error types
    r = np.average((y_pred - y) ** 2, weights=w)
    # map a NaN error to the worst possible fitness
    return np.nan_to_num(r, nan=float("inf"))

est = SymbolicRegressor(...
    metric=make_fitness(_fitness, greater_is_better=False),
)

So far it seems to work fine. Population average fitness is always inf, but it doesn't seem to matter for the actual performance of the algorithm.

So IMHO it would be great if this was available out-of-the-box via some keyword argument, e.g. nan_costs_inf=True/False.

@hwulfmeyer

hwulfmeyer commented Oct 30, 2020

Evolutionarily speaking, it would make more sense to assign a penalty to such functions. Giving them the worst fitness is akin to simply deleting them from the population.
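
One way to express that idea with gplearn's make_fitness (a sketch; the PENALTY value and the per-sample substitution are arbitrary illustrative choices, not anything gplearn provides):

import numpy as np
from gplearn.fitness import make_fitness

PENALTY = 1e3  # arbitrary finite penalty, tuned to the problem

def _penalized_mse(y, y_pred, w):
    with np.errstate(invalid='ignore', over='ignore'):
        sq = (y_pred - y) ** 2
    # substitute a large-but-finite penalty for non-finite errors
    # instead of assigning the individual the worst possible fitness
    err = np.where(np.isfinite(sq), sq, PENALTY)
    return np.average(err, weights=w)

penalized_mse = make_fitness(_penalized_mse, greater_is_better=False)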

@danuker

danuker commented Oct 30, 2020

@hwulfmeyer That is true. But removing numerically unstable individuals means the remaining ones will be stable, and in a model, instability is objectively much worse.

We might lose diversity, but that can be addressed in other ways. For instance, Nutonian Eureqa preserves greater population diversity by not relying on a single fitness metric: it lets all Pareto-nondominated individuals survive, i.e. those not dominated in both error and complexity (sketched below).

By yielding all of the potentially optimal solutions, a designer can make focused tradeoffs within this constrained set of parameters, rather than needing to consider the full ranges of parameters.

This is already requested in #33.
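
A generic sketch of such a non-domination filter, minimizing both error and complexity (illustrative only; not part of gplearn):

import numpy as np

# illustrative helper, not part of gplearn
def pareto_front(error, complexity):
    # boolean mask of individuals not dominated in both metrics
    error = np.asarray(error, dtype=float)
    complexity = np.asarray(complexity, dtype=float)
    dominated = np.zeros(len(error), dtype=bool)
    for i in range(len(error)):
        # i is dominated if some j is no worse on both metrics
        # and strictly better on at least one
        no_worse = (error <= error[i]) & (complexity <= complexity[i])
        strictly = (error < error[i]) | (complexity < complexity[i])
        dominated[i] = np.any(no_worse & strictly)
    return ~dominated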

@hwulfmeyer

hwulfmeyer commented Oct 30, 2020

@hwulfmeyer That is true. But removing numerically unstable individuals means the remaining ones will be stable, and in a model, instability is objectively much worse.

If you assign a penalty to illegal individuals, the evolutionary process will handle stability itself. Unstable individuals will be less likely to reproduce and thus less likely to produce further unstable individuals, while the small chance that they do reproduce means we don't lose all of their genetic material. Anyway, this discussion is probably unnecessary; I just wanted to point out that there are approaches to handling illegal individuals besides assigning the worst fitness. One could also try to implement 'repairing', a technique known from genetic algorithms.

Pareto-nondominated genetic programming is a great approach, but it falls short in one respect: complexity is not something that necessarily needs to be optimized.

Anyway, there is never only one true solution, especially here. It will be good to have several different approaches.

@danuker

danuker commented Oct 30, 2020

Complexity is not something that necessarily needs to be optimized.

Sure, it's not mandatory, but it is a good secondary metric to optimize: all things being equal, you prefer the simpler explanation (Occam's razor). There are also performance benefits, since evaluating a complex function is more expensive. Edit: it also acts somewhat like regularization by preferring simpler models (and I suspect it offers a slight reduction in overfitting).

There never is only one true solution

I agree. It would be hard to find a method that is fastest at everything; better to implement several and let people try them out. But I have used Eureqa, and it seems to work better than most GP logistic regression implementations, which is why I insisted.

@jmmcd (Contributor)

jmmcd commented Oct 13, 2021

Very interesting discussion!

I created a related issue (related to protected operators and closure, that is): #242.
