Improve closure by returning more reasonable values #130

Open · pricebenjamin opened this issue Feb 9, 2019 · 10 comments

@pricebenjamin (Contributor)

From the introduction:

In gplearn, several protected functions are used:

  • division, if the denominator lies between -0.001 and 0.001, returns 1.0.
  • square root returns the square root of the absolute value of the argument.
  • log returns the logarithm of the absolute value of the argument, or for very small values less than 0.001, it returns 0.0.
  • inverse, if the argument lies between -0.001 and 0.001, returns 0.0.
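
For concreteness, the quoted behavior corresponds roughly to the numpy sketch below. This is an illustration of the description above, not necessarily gplearn's exact source.

import numpy as np

# Sketch of the quoted behavior; may differ from gplearn's source.
def protected_division(x1, x2):
    # return 1.0 wherever the denominator lies in (-0.001, 0.001)
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x2) > 0.001, np.divide(x1, x2), 1.0)

def protected_sqrt(x):
    # square root of the absolute value of the argument
    return np.sqrt(np.abs(x))

def protected_log(x):
    # log of the absolute value; 0.0 for arguments in (-0.001, 0.001)
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x) > 0.001, np.log(np.abs(x)), 0.0)

def protected_inverse(x):
    # return 0.0 wherever the argument lies in (-0.001, 0.001)
    with np.errstate(divide='ignore', invalid='ignore'):
        return np.where(np.abs(x) > 0.001, 1.0 / x, 0.0)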

Are these return values widely accepted?

If the denominator lies between -0.001 and 0.001, why should the node return 1.0? Alternatively,

  • Prune the node
  • If -0.001 < denominator < 0, set denominator equal to -0.001, and if 0 < denominator < 0.001, set denominator equal to 0.001
  • Construct a piece-wise function that assumes values of 1/x outside of the interval [-0.001, 0.001] and uses a line segment to join the two pieces (or use a more complicated, but differentiable, solution)

Similar approaches could be used for the logarithm. Rough sketches of the second and third options are below.
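
One possible reading of those two alternatives (illustrative only; the function names are made up here, and the 0.001 threshold is taken from the discussion above):

import numpy as np

EPS = 0.001  # threshold from the discussion above

def div_clamped(x1, x2):
    # option 2: push near-zero denominators out to +/-EPS,
    # keeping the sign of the original denominator
    x2 = np.where(np.abs(x2) < EPS, np.copysign(EPS, x2), x2)
    return x1 / x2

def inverse_piecewise(x):
    # option 3: behaves as 1/x outside [-EPS, EPS]; inside, a line
    # segment through (-EPS, -1/EPS) and (EPS, 1/EPS), i.e. x / EPS**2
    with np.errstate(divide='ignore'):
        return np.where(np.abs(x) > EPS, 1.0 / x, x / EPS**2)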

@pricebenjamin (Contributor, Author)

Or just add epsilon... 😄 Sometimes the easier solution doesn't immediately come to mind.
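
That might look something like this (a sketch; the signed-epsilon handling is one arbitrary choice among several):

import numpy as np

# illustrative helper, not part of gplearn
def div_eps(x1, x2, eps=1e-12):
    # nudge the denominator away from zero by a tiny signed epsilon
    return x1 / (x2 + np.copysign(eps, x2))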

@hwulfmeyer

0.001 is quite a high value for protected division.
Good work on noticing this issue.

@trevorstephens (Owner)

Are these return values widely accepted?

The value I used was a bit of a hand-wringer at the time. What's a small number? What's a big number? :-) In the end I chose a value that I felt wasn't going to explode the output values "too much" ... https://cswww.essex.ac.uk/staff/rpoli/gp-field-guide/B3SourceCode.html#24_3 used 0.001 for protected division, so I just adopted that. It felt reasonable at the time. I haven't been challenged on it until now, but I figured that Poli, Langdon & McPhee wouldn't be spreading anything unreasonable.

I'll carry on the discussion at #131

@danuker

danuker commented Oct 3, 2019

I can't help but think... why not make a closure at the fitness evaluation?

  1. Numpy is perfectly capable of handling infinities. Wouldn't an infinite cost work as "this example is really bad and you should never use it"? If an infinite penalty would remove too much information about an individual, one could clip it with a +/- "maximum penalty" instead of infinity.
>>> import numpy as np
>>> 1 / np.zeros(1)
array([inf])
>>> -1 / np.zeros(1)
array([-inf])
  2. NaNs when the target is not a NaN should similarly return an infinite/maximum cost. Why wouldn't this work?
>>> y_pred = np.array([float('-inf'),  float('nan'), float('inf')])
>>> y = np.array([1, 2, 3])
>>> y_pred - y
array([-inf,  nan,  inf])
>>> err = (y_pred - y)
>>> np.where(np.isnan(err), np.inf, err)
array([-inf,  inf,  inf])

And if you want to clip:

>>> MAX_PENALTY=1000
>>> np.clip(np.where(np.isnan(err), np.inf, err), -MAX_PENALTY, MAX_PENALTY)
array([-1000.,  1000.,  1000.])

Infinitely punishing NaNs is presented here: Genetic programming as a model induction engine

Another approach (which is adopted here) is to use a special undefined value NaN that is returned by functions with illegal inputs. All functions that receive NaN in any of their inputs will also return NaN. Finally, the objective function will return the worst possible value when one of the inputs is NaN.

But I couldn't find any scientific references to the "MAX_PENALTY" I came up with.

Handling closure at the fitness-function level would allow much greater flexibility in the supported building-block functions and eliminate the effort of ensuring closure for each of them. In addition, it would eliminate the need for an arbitrary "safety" threshold.

@smheidrich

@danuker Making nan yield infinite cost / worst possible fitness was my immediate thought as well, so I've been using custom fitness functions that do just that as a workaround:

import numpy as np
from gplearn.genetic import SymbolicRegressor
from gplearn.fitness import make_fitness

def _fitness(y, y_pred, w):
    # example for mean squared error; replace the line below as
    # appropriate for other error types
    r = np.average((y_pred - y) ** 2, weights=w)
    # map a NaN error to the worst possible fitness
    return np.nan_to_num(r, nan=float("inf"))

est = SymbolicRegressor(...
    metric=make_fitness(_fitness, greater_is_better=False),
)

So far it seems to work fine. Population average fitness is always inf, but it doesn't seem to matter for the actual performance of the algorithm.

So IMHO it would be great if this was available out-of-the-box via some keyword argument, e.g. nan_costs_inf=True/False.

@hwulfmeyer

hwulfmeyer commented Oct 30, 2020

Evolutionarily speaking, it would make more sense to assign a penalty to such functions. Giving them the worst fitness is akin to simply deleting them from the population.
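
One way to express that idea with gplearn's make_fitness (a sketch; the PENALTY value and the per-sample substitution are arbitrary illustrative choices, not anything gplearn provides):

import numpy as np
from gplearn.fitness import make_fitness

PENALTY = 1e3  # arbitrary finite penalty, tuned to the problem

def _penalized_mse(y, y_pred, w):
    with np.errstate(invalid='ignore', over='ignore'):
        sq = (y_pred - y) ** 2
    # substitute a large-but-finite penalty for non-finite errors
    # instead of assigning the individual the worst possible fitness
    err = np.where(np.isfinite(sq), sq, PENALTY)
    return np.average(err, weights=w)

penalized_mse = make_fitness(_penalized_mse, greater_is_better=False)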

@danuker

danuker commented Oct 30, 2020

@hwulfmeyer That is true. But removing numerically unstable individuals means the remaining ones will be stable, and in a model, instability is objectively much worse.

We might lose diversity, but that can be addressed in other ways. For instance, Nutonian Eureqa preserves greater population diversity by not relying on a single fitness metric: it lets all Pareto-nondominated individuals survive, i.e. those not dominated in both error and complexity (sketched below).

By yielding all of the potentially optimal solutions, a designer can make focused tradeoffs within this constrained set of parameters, rather than needing to consider the full ranges of parameters.

This is already requested in #33.
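
A generic sketch of such a non-domination filter, minimizing both error and complexity (illustrative only; not part of gplearn):

import numpy as np

# illustrative helper, not part of gplearn
def pareto_front(error, complexity):
    # boolean mask of individuals not dominated in both metrics
    error = np.asarray(error, dtype=float)
    complexity = np.asarray(complexity, dtype=float)
    dominated = np.zeros(len(error), dtype=bool)
    for i in range(len(error)):
        # i is dominated if some j is no worse on both metrics
        # and strictly better on at least one
        no_worse = (error <= error[i]) & (complexity <= complexity[i])
        strictly = (error < error[i]) | (complexity < complexity[i])
        dominated[i] = np.any(no_worse & strictly)
    return ~dominated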

@hwulfmeyer

hwulfmeyer commented Oct 30, 2020

@hwulfmeyer That is true. But removing numerically unstable individuals means the remaining ones will be stable, and in a model, instability is objectively much worse.

If you assign a penalty to illegal individuals, the evolutionary process will handle stability itself. Unstable individuals will be less likely to reproduce and thus less likely to produce further unstable individuals, while the small chance that they do reproduce means we don't lose all of their genetic material. Anyway, this discussion is probably unnecessary; I just wanted to point out that there are approaches to handling illegal individuals besides assigning the worst fitness. One could also try to implement 'repairing', a technique known from genetic algorithms.

Pareto-nondominated genetic programming is a great approach, but it falls short in one respect: complexity is not something that necessarily needs to be optimized.

Anyway, there is never only one true solution, especially here. It will be good to have several different approaches.

@danuker

danuker commented Oct 30, 2020

Complexity is not something that necessarily needs to be optimized.

Sure, it's not mandatory, but it is a good secondary metric to optimize: all things being equal, you prefer the simpler explanation (Occam's razor). There are also performance benefits, since evaluating a complex function is more expensive. Edit: it also acts somewhat like regularization by preferring simpler models (and I suspect it offers a slight reduction in overfitting).

There never is only one true solution

I agree. It would be hard to find a method that is fastest at everything; better to implement several and let people try them out. But I have used Eureqa, and it seems to work better than most GP logistic regression implementations, which is why I insisted.

@jmmcd (Contributor)

jmmcd commented Oct 13, 2021

Very interesting discussion!

I created a related issue (related to protected operators and closure, that is): #242.
