
Constrain the GP fit to be strictly positive #106

Open
maria-vincenzi opened this issue Nov 24, 2018 · 11 comments

@maria-vincenzi

Hi,
I have a question (sorry if this is not strictly related to the code implementation itself).
Is there any way to constrain the GP fit to be strictly positive?
Let's say I'm trying to fit some physical quantity that has to be positive (e.g. flux).
Is there a way to prevent the GP from going below zero?
The only solution I came up with was to perform the GP fit in log space, but this is not performing very well (I get smooth fits in log space, but when I transform everything back to linear space the fit looks a little bit "messy").
Thank you,
Maria

@dfm
Owner

dfm commented Nov 24, 2018 via email

@davidwhogg

It feels like a constrained-positive solution is possible -- because it is for any finite-parameter model. But I don't know how to do it!

@tmcclintock

I just wanted to chime in that this is possible by transforming the Gaussian posterior predictive distribution to a truncated Gaussian PPD. In other words, one can train the GP on the normal training data, but then just plug the moments of the GP (the mean and cov) into the truncated Gaussian distribution along with the bounds and recover a PDF over the truncated range. In the case of requiring y>0 you can set the lower bound on the GP to be a=0. This is a cute solution too because your truncation bounds can vary with input x.

You just have to be careful, since the ±68% confidence intervals (the sigma levels) are not symmetric about the mean prediction, and furthermore the mean prediction is no longer the median of the PPD. It is straightforward to compute the median of the truncated Gaussian, though.
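The moment-plug-in step described above can be sketched with scipy. This is a minimal illustration, not code from this repo: `mu` and `sigma` are made-up stand-ins for a trained GP's predictive mean and standard deviation.

```python
import numpy as np
from scipy.stats import truncnorm

# Hypothetical GP predictive moments at three test points (stand-ins;
# in practice these come from gp.predict / the posterior predictive).
mu = np.array([0.5, 0.1, -0.2])
sigma = np.array([0.3, 0.4, 0.5])

# Enforce y > 0 by truncating at a lower bound of 0.
# truncnorm takes its bounds in standardized units: (bound - loc) / scale.
lower = 0.0
a = (lower - mu) / sigma
b = np.full_like(mu, np.inf)  # no upper bound

mean = truncnorm.mean(a, b, loc=mu, scale=sigma)
median = truncnorm.median(a, b, loc=mu, scale=sigma)
lo, hi = truncnorm.interval(0.68, a, b, loc=mu, scale=sigma)

# Near the bound the truncated mean is pulled above the raw GP mean,
# and the 68% interval (lo, hi) is asymmetric about it.
```

Note that the truncated mean and median always sit strictly inside the support, so the predictions respect y > 0 even where the raw GP mean is negative.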

Here is an example figure showing the tutorial GP with a truncated PPD to exist only in y>0. The black points, green line, and gray lines are all the same as the tutorial. The red line shows the mean of the truncated GP PPD and the red shaded regions show the 68% confidence interval (not the standard deviation).

[figure: tutorial GP with truncated PPD restricted to y > 0]

@dfm @davidwhogg if you think this might be worth a PR I'd be happy to implement it.

@lleewwiiss

lleewwiiss commented Jul 16, 2020

Also worth considering: https://arxiv.org/abs/2004.04632

@tmcclintock

@lleewwiiss good find. It turns out that because of the nice properties of Gaussians, you can construct even more sophisticated constrained GPs (like ones where the slope or convexity of the GP is bounded). Here is a nice recent stats paper on the topic https://arxiv.org/abs/1703.00787.

@lleewwiiss

@tmcclintock Interesting, I will check it out. Do you have an example of how you implemented the above constraints? I tried but can't seem to work it out.

@tmcclintock

Oh the ones in the paper? I do not. Or did you mean the one I commented about in November?

@lleewwiiss

The comment from November using the tutorial example

@tmcclintock

Ah I see. Not on hand, but I'll whip up a gist and ping you.

scipy.stats.truncnorm is very unwieldy and not well suited to working with a GP, since the loc and scale arguments need to change over the domain of the prediction; a vectorized implementation would be nice to have.

@tmcclintock

@lleewwiiss here you go: https://github.com/tmcclintock/TruncatedGP
It ended up being too long for a gist. I show the posteriors for the truncated and non-truncated GPs, as well as draws from both posteriors.

If you look at the code you will see that you need to do some annoying transformations because of the way scipy.stats.truncnorm works. The upside is that my implementation allows for arbitrary bounds a(x) and b(x).
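For illustration, the bound transformation and posterior draws could look like the following hedged sketch. Everything here is made up for the example (`mu`, `sigma`, and the bound arrays stand in for real GP predictions and for the arbitrary bounds a(x), b(x)); it is not the code from the linked repo.

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)

# Hypothetical predictive moments over a grid of inputs.
x = np.linspace(0.0, 1.0, 5)
mu = np.sin(2 * np.pi * x)       # stand-in for the GP predictive mean
sigma = np.full_like(x, 0.3)     # stand-in for the predictive std dev

# Input-dependent bounds a(x) <= y <= b(x); here constant for simplicity.
a_x = np.zeros_like(x)           # lower bound: enforce y >= 0
b_x = np.full_like(x, 2.0)       # upper bound

# The "annoying transformation": truncnorm wants bounds expressed in
# standardized units relative to loc and scale.
a = (a_x - mu) / sigma
b = (b_x - mu) / sigma

# Draw from the truncated PPD at every input at once (broadcasts over x).
draws = truncnorm.rvs(a, b, loc=mu, scale=sigma,
                      size=(100, x.size), random_state=rng)
```

Because truncnorm broadcasts over array-valued `a`, `b`, `loc`, and `scale`, the bounds can vary freely with x without any Python-level loop.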

Drop me a line if you feel like spinning this out into a real project.

@lleewwiiss

@tmcclintock Thank you! I will check it out now, and credit you where I can
