Inconsistent results with HODLRSolver #128

astrozot · 2020-06-07T11:43:32Z

I am experimenting with the HODLR solver, but I have seen some strange behaviour that currently prevents me from using it. In various situations, the computed log likelihood appears to be wrong in ways I cannot really explain.

Consider the following code (very similar to the tutorial on big data):

import numpy as np
import george

np.random.seed(123)
n = 200
x = np.random.uniform(0, 10, n)
yerr = 0.1 * np.random.rand(n)
y = np.sin(x) + yerr * np.random.randn(n)

kernel = 1.0 * george.kernels.ExpSquaredKernel(1.0)

gp_basic = george.GP(kernel)
gp_basic.compute(x, yerr)
print(gp_basic.log_likelihood(y))

The printed result is 319.7841482562233. However, when using the HODLRSolver:

gp_hodlr = george.GP(kernel, solver=george.HODLRSolver, seed=42)
gp_hodlr.compute(x, yerr)
print(gp_hodlr.log_likelihood(y))

the result is 296.8650016184164, which is quite far off. Strangely enough, if I set n = 199 in the code above, everything is back to normal and the HODLR solver gives results very close to the basic solver.

Is this expected and normal? In more complicated cases, it seems to me that this erratic behaviour can affect the convergence of the algorithm during the optimization of the hyperparameters.
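For anyone wanting an independent reference value to compare both solvers against, the log likelihood can also be computed with plain NumPy. This is only a sketch: it assumes a zero mean function and that `1.0 * ExpSquaredKernel(1.0)` corresponds to k(x_i, x_j) = exp(-(x_i - x_j)^2 / 2) with `yerr**2` added on the diagonal. The helper name `dense_log_likelihood` is made up for illustration and is not part of george.

```python
import numpy as np


def dense_log_likelihood(x, y, yerr, amp=1.0, metric=1.0):
    """Zero-mean GP log likelihood from a dense Cholesky factorization.

    Assumes a squared-exponential kernel amp * exp(-r^2 / (2 * metric))
    with yerr**2 added on the diagonal (an assumption about the kernel
    convention, not taken from the george source).
    """
    r2 = (x[:, None] - x[None, :]) ** 2
    K = amp * np.exp(-0.5 * r2 / metric)
    K[np.diag_indices_from(K)] += yerr ** 2

    L = np.linalg.cholesky(K)              # K = L L^T
    alpha = np.linalg.solve(L, y)          # alpha = L^-1 y, so y^T K^-1 y = alpha . alpha
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (alpha @ alpha + logdet + len(x) * np.log(2.0 * np.pi))


# Same data as in the snippet above
np.random.seed(123)
n = 200
x = np.random.uniform(0, 10, n)
yerr = 0.1 * np.random.rand(n)
y = np.sin(x) + yerr * np.random.randn(n)

print(dense_log_likelihood(x, y, yerr))
```

If the assumptions about the kernel hold, this number should be comparable to the basic solver's output, which helps to tell which of the two solvers is the one drifting.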


dfm · 2020-06-08T18:26:43Z

The HODLR solver is stochastic, so it can be quite inaccurate, especially when the system is not well conditioned. Generally this doesn't have a huge effect on hyperparameter inference, but these days I'd probably recommend using a different library if you need scalability, because a lot of work has been done on developing better tools. For example:

If your problem is 1D (like that sample code), I'd recommend celerite. The interface is about the same, but it'll be orders of magnitude faster and numerically stable.
If you need to work in higher dimensions, the best library that I know about is GPyTorch. It has several algorithms that should be fast and scale well, although I haven't used it much myself.

Hope this helps!

astrozot · 2020-06-10T20:23:14Z

Thank you for your quick reply! I am still surprised by the behaviour of the HODLR solver: I understand it is stochastic, but I find it hard to comprehend why it fails completely when the number of points goes from 199 to 200 (in a case that, to me, looks quite well conditioned).
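One way to check the conditioning directly, rather than judging it by eye, is to look at the eigenvalues of the covariance matrix. The sketch below rebuilds the matrix with NumPy under the same assumptions as before (squared-exponential kernel with unit amplitude and unit metric, `yerr**2` on the diagonal); `kernel_condition_number` is a hypothetical helper, not a george function.

```python
import numpy as np


def kernel_condition_number(x, yerr):
    """Ratio of largest to smallest eigenvalue of K + diag(yerr**2).

    K is assumed to be exp(-(x_i - x_j)^2 / 2); this mirrors the kernel
    used in the example only under that assumption.
    """
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
    K[np.diag_indices_from(K)] += yerr ** 2
    eig = np.linalg.eigvalsh(K)  # sorted in ascending order
    return eig[-1] / eig[0]


# Same x and yerr as in the original example
np.random.seed(123)
n = 200
x = np.random.uniform(0, 10, n)
yerr = 0.1 * np.random.rand(n)

cond = kernel_condition_number(x, yerr)
print("condition number:", cond)
print("smallest yerr**2:", np.min(yerr ** 2))
```

Since some of the `yerr` values can be very small, the smallest eigenvalue is bounded below only by the smallest `yerr**2`, so the condition number may turn out larger than the smooth-looking data suggests.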

Anyway, I will follow your suggestions and give a try to GPyTorch (my problem is 2D). Thank you again!

This was referenced Aug 16, 2020

general matern kernels #134

Open

Transition over from george to GPyTorch oceanhackweek/ohw20-proj-argo-mapping#23

Open
