I am experimenting with the HODLR solver, but unfortunately I have run into some strange behaviour that, for the moment, prevents me from using it. In various situations the computed log likelihood shows errors that I cannot really understand.
Consider the following code (very similar to the big-data tutorial):
import numpy as np
import george
np.random.seed(123)
n = 200
x = np.random.uniform(0, 10, n)
yerr = 0.1 * np.random.rand(n)
y = np.sin(x) + yerr * np.random.randn(n)
kernel = 1.0 * george.kernels.ExpSquaredKernel(1.0)
gp_basic = george.GP(kernel)
gp_basic.compute(x, yerr)
print(gp_basic.log_likelihood(y))
The printed result is 319.7841482562233. However, when using the HODLRSolver, the result is 296.8650016184164, which is quite far off. Strangely enough, if I set n = 199 in the code above, everything is back to normal and the HODLR solver gives results very close to those of the basic solver.
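The exact log likelihood is easy to cross-check by hand. Here is a minimal numpy sketch, assuming a zero-mean GP, the covariance k(r) = exp(-0.5 r²) implied by 1.0 * ExpSquaredKernel(1.0), and yerr² added to the diagonal (which is what compute(x, yerr) does):

```python
import numpy as np

# Same data as the reproduction above.
np.random.seed(123)
n = 200
x = np.random.uniform(0, 10, n)
yerr = 0.1 * np.random.rand(n)
y = np.sin(x) + yerr * np.random.randn(n)

# Covariance of 1.0 * ExpSquaredKernel(1.0): k(r) = exp(-0.5 * r**2),
# with the per-point noise variances added on the diagonal.
r = x[:, None] - x[None, :]
K = np.exp(-0.5 * r**2) + np.diag(yerr**2)

# Exact Gaussian log likelihood via a Cholesky factorization.
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
log_det = 2.0 * np.sum(np.log(np.diag(L)))
ll = -0.5 * (y @ alpha + log_det + n * np.log(2 * np.pi))
print(ll)  # should match the basic solver, not the HODLR result
```

This direct computation should agree with the basic solver's 319.784..., which suggests the HODLR value is the approximation error.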
Is this expected and normal? In more complicated cases, it seems that this erratic behaviour can affect convergence during the optimization of the hyperparameters.
The HODLR solver is stochastic so it can be quite inaccurate especially when the system is not well conditioned. Generally this doesn't have a huge effect on hyperparameter inference, but these days I'd probably recommend using a different library if you need scalability because there has been a lot of work done on developing better tools. For example:
If your problem is 1D (like that sample code), I'd recommend celerite. The interface is about the same, but it'll be orders of magnitude faster and numerically stable.
If you need to work in higher dimensions, the best library that I know about is GPyTorch. It has several algorithms that should be fast and scale well, although I haven't used it much myself.
Thank you for your quick reply! I am still surprised by the behaviour of the HODLR solver: I understand it is stochastic, but I find it hard to comprehend why it fails completely when the number of points goes from 199 to 200 (in a case that, to me, looks quite well conditioned).
Anyway, I will follow your suggestions and give GPyTorch a try (my problem is 2D). Thank you again!