Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numerical issues #145

Open
MikeInnes opened this issue Nov 20, 2020 · 3 comments
Open

Numerical issues #145

MikeInnes opened this issue Nov 20, 2020 · 3 comments

Comments

@MikeInnes
Copy link

Stheno is great and I'm enjoying it a lot so far.

In some cases I've found that it's easy to run into numerical issues when getting the logpdf, when the distance between x values is smaller than the length scale:

julia> using Stheno, LinearAlgebra

julia> f = GP(EQ(), GPC())
GP{Stheno.ZeroMean{Float64},EQ}(Stheno.ZeroMean{Float64}(), EQ(), 1, GPC(1))

julia> x = range(0, 1, length = 15)
0.0:0.02040816326530612:1.0

julia> cholesky(cov(f(x)))
ERROR: PosDefException: matrix is not positive definite; Cholesky factorization failed.

AFAIU the covariance matrix should be positive definite by construction since the GP kernel is. It seems like the problem may be that in this case the eigenvalues are too small – not actually zero, but small enough to confuse the cholesky routine. (Does that sound right?)

Is there any more direct or numerically stable way to get the GP's logpdf that could avoid this issue?

@willtebbutt
Copy link
Member

Glad to hear that you're enjoying it Mike :)

It seems like the problem may be that in this case the eigenvalues are too small – not actually zero, but small enough to confuse the cholesky routine. (Does that sound right?)

This diagnosis is exactly right -- you wind up with numerically negative eigenvalues, and the Cholesky factorisation breaks down.

The standard thing to do is add a small amount of "jitter" to the covariance matrix C. So instead of computing

cholesky(C)

you compute

cholesky(C + 1e-9I)

or something like that.

This is sufficiently common that I made it possible to do this really easily in Stheno. You should interpret

f(x, 1e-9)

as "the multivariate Normal given by the sum of f at x and iid Gaussian noise with variance 1e-9". In my experience, it's fine to do this. So

cholesky(cov(f(x, 1e-9)))

should work just fine (hopefully).

@MikeInnes
Copy link
Author

Aaaah, ok. I've actually used the additional noise term in a bunch of places, but hadn't at all connected this numerical issue to the lack of a noise term. You've saved me from a real rabbit hole with your quick response, thank you!

Feel free to close this, or alternatively we can consider it a docs issue (I might have some time to add a "troubleshooting" section at some point, which might be useful for beginners like myself).

@willtebbutt
Copy link
Member

A docs issue would be an excellent idea, particularly if you have time to write it! Let's keep it open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants