
Different results pydlm vs dlm package (in R) #24

Open
roinaveiro opened this issue Nov 29, 2017 · 1 comment

Comments

@roinaveiro

I get different results for an extremely simple dynamic linear model when using the dlm package in R versus pydlm.

In particular, I create a random walk + noise dlm (that is, a dlm with just a trend of degree 0 or 1, depending on the version of the pydlm package). I fix the discount factor to 1, so that W_t = 0 for all t (i.e., the variance of the state-evolution noise is zero). The noisePrior is 1.0 by default. I fix the prior variance to 2.0 and include a single observation: 1.70. With this I build the dlm.

Here is the code:

```python
from pydlm import dlm, trend

d = dlm([1.70])
t = trend(degree=0, discount=1, w=2)
d = d + t
d.fitForwardFilter()
print(d.getLatentState())  # [[1.1333333333333333]]
print(d.getLatentCov())    # [matrix([[ 0.65444444]])]
```

Doing the same in R, I obtain the same latent state but a different state covariance; specifically, I obtain 0.666666.

In fact, performing the calculation by hand, I obtain the same result as in R. This is easy to reproduce following the formulas on page 53 of *Dynamic Linear Models with R* by Giovanni Petris, Sonia Petrone, and Patrizia Campagnoli.
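The hand calculation can be written out as a one-step Kalman filter update (a minimal sketch, assuming the observation noise is fixed at V = 1, as the R model does; the variable names are mine, not from either package):

```python
# One-step Kalman filter update for a random walk + noise (local level)
# model with W = 0 (discount = 1), following the standard forward-filter
# formulas, with the observation noise fixed at V = 1.

m0, C0 = 0.0, 2.0   # prior mean and variance of the state
V = 1.0             # observation noise variance (fixed)
W = 0.0             # state-evolution noise variance (discount = 1)
y = 1.70            # the single observation

# Prediction step
a = m0              # predicted state mean
R = C0 + W          # predicted state variance
f = a               # one-step forecast mean
Q = R + V           # one-step forecast variance

# Update step
K = R / Q           # Kalman gain = 2/3
m1 = a + K * (y - f)   # posterior state mean ≈ 1.1333 (both packages agree)
C1 = R - K * R         # posterior state variance = 2/3 ≈ 0.6667 (the R value)

print(m1, C1)
```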

Am I misunderstanding the meaning of some parameter?

@wwrechard
Owner

Hi roinaveiro,

Thanks for raising this issue. Your understanding is correct, but note that the underlying assumptions differ between pydlm and the R code.

Based on page 53 of the book, what the R code does is similar to pydlm, with one key difference: it assumes the observation noise (v_t) is fixed at 1. In pydlm, the observation noise (noisePrior) is initialized at 1 but is then learned from the data and changes over time (you can tell this from the parameter name). The result you get from R therefore differs from pydlm because of this self-learning property.

One way to get a consistent result with pydlm is to fix the observation noise, or to set the initial confidence in the noisePrior high enough that the noise value is not updated. However, this parameter is currently not exposed to the user. Please let me know if this is an important feature for you (i.e., telling the model not to learn the observation noise from the data).
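For reference, the reported value 0.65444 is exactly what the standard West & Harrison variance-discounting recursion produces after one step. The sketch below is my reconstruction of that recursion, not pydlm's actual internals; the degrees-of-freedom prior n0 = 1 is an assumption:

```python
# Sketch: why a self-learned observation noise yields 0.6544 instead of 2/3.
# Assumes the West & Harrison recursion for the noise estimate s_t,
# with an assumed prior degrees of freedom n0 = 1.

m0, C0 = 0.0, 2.0   # prior mean and variance of the state
s0, n0 = 1.0, 1.0   # noisePrior and its (assumed) degrees of freedom
y = 1.70

# Forecast step (W = 0 since discount = 1)
R = C0
Q = R + s0          # forecast variance, using the current noise estimate
e = y - m0          # forecast error
K = R / Q           # adaptive gain = 2/3

# Posterior state mean: identical to the fixed-noise case, ≈ 1.1333
m1 = m0 + K * e

# Update the noise estimate from the data
n1 = n0 + 1.0
s1 = (n0 * s0 + s0 * e**2 / Q) / n1   # shrinks below 1 because e^2/Q < 1

# The state variance is rescaled by the updated noise estimate:
# (s1/s0) * 2/3 ≈ 0.654444, matching pydlm's output
C1 = (s1 / s0) * (R - K**2 * Q)

print(m1, s1, C1)
```

With the noise estimate held fixed at s0 (e.g., via a very confident prior), the rescaling factor s1/s0 becomes 1 and the result reduces to the R value of 2/3.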

Thanks!
