Excessive memory use creating OrdinaryKriging object (even if not estimating variogram) #264

Open
fiftysevendegreesofrad opened this issue Dec 12, 2022 · 1 comment

fiftysevendegreesofrad commented Dec 12, 2022

Hi, this looks like a great library, but I wonder if I'm using it wrong.

I would like to krige from a large number of data points. I am not estimating the variogram, and when I call execute() I plan to restrict the calculation to n_closest_points. However, the code below fails with a MemoryError: the constructor seems to compute a pairwise distance matrix for all of my input points. I'm not sure why that is necessary when the variogram parameters are already provided.

print(data.shape) # outputs (600000,3)

OK = pykrige.ok.OrdinaryKriging(
    data[:, 0],
    data[:, 1],
    data[:, 2],
    variogram_model="gaussian",
    variogram_parameters={
        "sill": fit_model.sill,
        "range": fit_model.len_scale,
        "nugget": fit_model.nugget,
    },
)

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-31-86755f124e9f> in <module>
     15 
     16 #for some reason still tries to fit and runs out of memory
---> 17 OK = pykrige.ok.OrdinaryKriging(
     18     data[:, 0],
     19     data[:, 1],

~\Anaconda2\envs\bayesiandrape\lib\site-packages\pykrige\ok.py in __init__(self, x, y, z, variogram_model, variogram_parameters, variogram_function, nlags, weight, anisotropy_scaling, anisotropy_angle, verbose, enable_plotting, enable_statistics, coordinates_type, exact_values, pseudo_inv, pseudo_inv_type)
    319             self.semivariance,
    320             self.variogram_model_parameters,
--> 321         ) = _initialize_variogram_model(
    322             np.vstack((self.X_ADJUSTED, self.Y_ADJUSTED)).T,
    323             self.Z,

~\Anaconda2\envs\bayesiandrape\lib\site-packages\pykrige\core.py in _initialize_variogram_model(X, y, variogram_model, variogram_model_parameters, variogram_function, nlags, weight, coordinates_type)
    457     # to calculate semivariances...
    458     if coordinates_type == "euclidean":
--> 459         d = pdist(X, metric="euclidean")
    460         g = 0.5 * pdist(y[:, None], metric="sqeuclidean")
    461 

~\Anaconda2\envs\bayesiandrape\lib\site-packages\scipy\spatial\distance.py in pdist(X, metric, out, **kwargs)
   2231         if metric_info is not None:
   2232             pdist_fn = metric_info.pdist_func
-> 2233             return pdist_fn(X, out=out, **kwargs)
   2234         elif mstr.startswith("test_"):
   2235             metric_info = _TEST_METRICS.get(mstr, None)

MemoryError: Unable to allocate 1.31 TiB for an array with shape (179999700000,) and data type float64
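
The size in the error message can be checked by hand. scipy's pdist returns a condensed distance vector with one entry per unordered pair of points, so n points give n(n-1)/2 float64 values. A minimal sketch of that arithmetic (the helper name `condensed_pairs` is made up for illustration, not part of scipy or PyKrige):

```python
def condensed_pairs(n: int) -> int:
    """Number of entries in a condensed pairwise-distance vector for n points."""
    return n * (n - 1) // 2


n_points = 600_000                   # rows in `data`, from data.shape above
n_pairs = condensed_pairs(n_points)  # one distance per unordered pair
n_bytes = n_pairs * 8                # float64 uses 8 bytes per value
size_tib = n_bytes / 2**40           # bytes -> TiB

print(n_pairs)                 # 179999700000, the shape in the traceback
print(f"{size_tib:.2f} TiB")   # 1.31 TiB, the size in the traceback
```

So the allocation is exactly what a full pairwise distance computation over 600,000 points requires. The second pdist call on the z values (line 460 in the traceback) would ask for the same amount again.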

MuellerSeb (Member) commented

That is indeed strange, and I wonder why this issue was never raised before. If the parameters are given, the empirical variogram shouldn't be calculated.

Thanks for pointing this out. This is a very old bug (about 9 years old: https://github.com/GeoStat-Framework/PyKrige/blame/v1.3.1/pykrige/core.py#L186).
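
For illustration only, here is a minimal sketch of the kind of guard described above: compute the pairwise distances only when the variogram actually has to be estimated. This is not PyKrige's code and not a proposed patch. The function name `initialize_variogram` and its return convention are assumptions made up for this example, and the fitting step itself is left out.

```python
def initialize_variogram(coords, values, variogram_parameters=None):
    """Hypothetical helper (not PyKrige's API).

    Returns (distances, semivariances, parameters). The O(n^2) pairwise
    arrays are built only when no variogram parameters were supplied.
    """
    if variogram_parameters is not None:
        # The caller provided sill/range/nugget: nothing needs to be fitted,
        # so the empirical variogram (and its memory cost) is skipped.
        return None, None, dict(variogram_parameters)

    # Imported lazily so the cheap path above has no heavy dependencies.
    import numpy as np
    from scipy.spatial.distance import pdist

    # Condensed pairwise distances and semivariances, as in the traceback.
    d = pdist(np.asarray(coords, dtype=float), metric="euclidean")
    g = 0.5 * pdist(np.asarray(values, dtype=float)[:, None], metric="sqeuclidean")
    # Fitting a model to (d, g) would happen here; omitted in this sketch.
    return d, g, None


# With parameters given, no distance arrays are allocated.
# The numeric values are placeholders, not taken from the issue.
params = {"sill": 1.0, "range": 250.0, "nugget": 0.1}
d, g, used = initialize_variogram([(0.0, 0.0), (1.0, 1.0)], [3.0, 4.0], params)
print(d, g, used)
```

Anything downstream that expects the empirical lags and semivariances would need to handle their absence when the parameters are supplied.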
