
Chp_14 potentially incorrect covariance functions #164

Open
rasoolianbehnam opened this issue Oct 26, 2021 · 2 comments

rasoolianbehnam commented Oct 26, 2021

I could be wrong, but I believe that in Chp_14.ipynb cell 72 the covariance function

SIGMA = etasq * pm.gp.cov.Exponential(input_dim=1, ls=rhosq)

is not correct, since it only takes the first column of Dmat_ord. I think a correct covariance function would be something like the class below:

import numpy as np
import theano.tensor as tt

class ExpCov:  # illustrative name for the custom covariance
    def __init__(self, etasq, rhosq):
        self.etasq = etasq
        self.rhosq = rhosq

    def __call__(self, X):
        # elementwise exponential kernel over the full distance matrix, plus a small diagonal jitter
        return self.etasq * tt.exp(-self.rhosq * X) + tt.diag(np.ones(len(X)) * 0.01)

A similar problem is also present in cell 55.
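
For a quick self-contained check of this (the 10x10 matrix below is just a stand-in for Dmat_ord, and the fixed etasq/rhosq values are arbitrary):

import numpy as np
import pymc3 as pm  # assuming the PyMC3 / Theano stack the notebook uses here

# stand-in for Dmat_ord: a symmetric 10x10 distance matrix built from random points
rng = np.random.default_rng(0)
pts = rng.normal(size=(10, 2))
Dmat = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))

etasq, rhosq = 1.0, 0.5

# built-in kernel with input_dim=1: it silently keeps only the first column of Dmat
K_builtin = (etasq * pm.gp.cov.Exponential(input_dim=1, ls=rhosq))(Dmat).eval()

# proposed kernel: applied elementwise to every distance, plus the diagonal jitter
K_custom = ExpCov(etasq, rhosq)(Dmat).eval()

print(K_builtin.shape, K_custom.shape)   # both (10, 10)
print(np.allclose(K_builtin, K_custom))  # False: the two covariances differ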


francescolosterzo commented Mar 23, 2022

Even though I came at it from a different angle, I have been fighting with this covariance function for a few days now, and what you proposed here did indeed solve my problem (as far as I understand).

I have been working on the exercises, and ex. 14M5 asks to reimplement the same model. If I redo the implementation proposed in cell 72 of Chp_14.ipynb it works fine. I then tried to implement the model without writing a custom class for the mean, simply using the same form as the previous models and changing only the covariance matrix. Here is what I came up with:

m_14m5b = pm.Model()

with m_14m5b:
    
    a = pm.Normal('a', mu=0, sigma=1)
    bB = pm.Normal('bB', mu=0, sigma=0.5)
    bM = pm.Normal('bM', mu=0, sigma=0.5)
    
    eta_sq = pm.Exponential('eta_sq', lam=1)
    rho_sq = pm.Normal('rho_sq', mu=3, sigma=0.25)
    
    mu = a + bB * B + bM * M
    SIGMA = eta_sq * pm.gp.cov.Exponential(input_dim=1, ls=rho_sq)(Dmat_ord)
    
    G = pm.MvNormal('G', mu=mu, cov=SIGMA, observed=G_)
    
    m_14m5b_trace = pm.sample(return_inferencedata=True)    

With this model I kept getting sampling errors:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'a': array(0.), 'bB': array(0.), 'bM': array(0.), 'eta_sq_log__': array(-0.36651292), 'rho_sq': array(3.)}

Initial evaluation results:
a              -0.92
bB             -0.23
bM             -0.23
eta_sq_log__   -1.06
rho_sq          0.47
G               -inf
Name: Log-probability of test_point, dtype: float64

I suspected this was due to the lack of a noise term in the covariance, since I got the same error with the original model when I set noise=0 in gp.marginal_likelihood.

I then added tt.diag(np.ones(len(X)) * .01) to the definition of SIGMA in the model above and it worked!
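
Concretely, inside the model above (with numpy as np and theano.tensor as tt imported) the covariance line that samples fine for me looks roughly like this; the 0.01 jitter magnitude is arbitrary, and presumably it works because it keeps SIGMA positive definite:

    # same kernel as before, plus a small diagonal "noise" term on the covariance
    SIGMA = (
        eta_sq * pm.gp.cov.Exponential(input_dim=1, ls=rho_sq)(Dmat_ord)
        + tt.diag(np.ones(len(Dmat_ord)) * 0.01)
    )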


daniel-saunders-phil commented Mar 13, 2023

Hi, I also got stuck on this. It seems like the notebook changed at some point in the last two years, because the current code looks quite a bit different, but the bug seems to have stuck around. The simplest fix, in terms of keeping as much of the pymc code as possible, is gp.cov.ExpQuad(input_dim=10) instead of input_dim=1. I think this arises because McElreath uses a 10x10 matrix to represent distances, whereas most pymc GP examples use a single-column vector of inputs. I tried the column-vector version too, and it looks like it would force us to rewrite a lot of plotting code. Changing the input_dim also improves how well our estimates match the reported estimates in the 2nd edition of Statistical Rethinking. I'll make a pull request to fix this.

with pm.Model() as m14_8:
    a = pm.Exponential("a", 1.0)
    b = pm.Exponential("b", 1.0)
    g = pm.Exponential("g", 1.0)

    etasq = pm.Exponential("etasq", 2.0)
    ls_inv = pm.HalfNormal("ls_inv", 2.0)
    rhosq = pm.Deterministic("rhosq", 0.5 * ls_inv**2)

    # Implementation with PyMC's GP module:
    cov = etasq * pm.gp.cov.ExpQuad(input_dim=10, ls_inv=ls_inv) # change input_dim=10
    gp = pm.gp.Latent(cov_func=cov)
    k = gp.prior("k", X=Dmatsq)

    lam = (a * P**b / g) * tt.exp(k[society])

    T = pm.Poisson("total_tools", lam, observed=tools)

    trace_14_8 = pm.sample(4000, tune=2000, target_accept=0.99, random_seed=RANDOM_SEED)
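
As a quick sanity check of the input_dim point, a standalone sketch (not from the notebook; the matrix below is just a stand-in for Dmatsq):

import numpy as np
import pymc as pm  # PyMC3 behaves the same way here

D = np.arange(100, dtype=float).reshape(10, 10)  # stand-in for a 10x10 distance matrix
k1 = pm.gp.cov.ExpQuad(input_dim=1, ls=1.0)

# with input_dim=1 the kernel keeps only the first column of its input (and warns),
# so evaluating it on the full matrix gives the same result as on D[:, :1]
print(np.allclose(k1(D).eval(), k1(D[:, :1]).eval()))  # True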
