/
Chap5_Lab1.Rnw
executable file
·236 lines (174 loc) · 12.1 KB
/
Chap5_Lab1.Rnw
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
\documentclass[b5paper,11pt]{book}
\input{preamble}
\ifthenelse{\equal{\jobname}{\detokenize{Chap5_Lab1}}}
{\standalonetrue}
{\standalonefalse}
\begin{document}
<<echo=FALSE>>=
opts_chunk$set(background = rgb(1,1,0.85),
size='small')
@
\addcontentsline{toc}{section}{Lab 5.1: Implementing an IDE Model in One-Dimensional Space}
\section*{Lab 5.1: Implementing an IDE Model in One-Dimensional Space}\label{lab:5_IDEsim}
\markright{LAB 5.1: IMPLEMENTING AN IDE MODEL IN ONE-DIMENSIONAL SPACE}
In this Lab we take a look at how one can implement a stochastic integro-difference equation (IDE) in one-dimensional space and time, from first principles. Specifically, we shall consider the dynamic model,
$$Y_{t}(s) = \int_{D_s}m(s,x;\bftheta_p)Y_{t-1}(x) \intd x + \eta_t(s),\quad s,x \in D_s,$$
\noindent where $Y_t(\cdot)$ is the spatio-temporal process at time $t$; $\bftheta_p$ are parameters that we fix (in practice, these will be estimated from data; see Lab 5.2); and $\eta_t(\cdot)$ is a spatial process, inde\-pend\-ent of $Y_t(\cdot)$, with covariance function that we shall assume is known.
We only need the packages {\bf dplyr}, {\bf ggplot2}, and {\bf STRbook} for this lab and, for reproducibility purposes, we fix the seed.
<<message=FALSE, warning=FALSE>>=
library("dplyr")
library("ggplot2")
library("STRbook")
set.seed(1)
@
<<echo = FALSE, warning=FALSE, message=FALSE>>=
library("gridExtra")
@
\subsection*{Constructing the Process Grid and Kernel}
We start off by constructing a discretization of the one-dimensional spatial domain $D_s = [0,1]$. We shall use this discretization, containing cells of width $\Delta_s$, for both approximate integrations as well as visualizations. We call this our spatial grid.
<<>>=
ds <- 0.01
s_grid <- seq(0, 1, by = ds)
N <- length(s_grid)
@
\noindent Our space-time grid is formed by calling $\fn{expand.grid}$ with \cc{s\_grid} and our temporal domain, which we define as the set of integers spanning 0 up to $T = 200$.
<<>>=
nT <- 201
t_grid <- 0:(nT-1)
st_grid <- expand.grid(s = s_grid, t = t_grid)
@
The transition kernel $m(s,x; \bftheta_p)$ is a bivariate function on our spatial grid. It is defined below to be a Gaussian kernel, where the entries of $\bftheta_p = (\theta_{p,1},\theta_{p,2},\theta_{p,3})'$ are the amplitude, the scale (aperture, twice the variance), and the shift (offset) of the kernel, respectively. Specifically,
$$
m(s,x; \bftheta_p) \equiv \theta_{p,1}\exp\left(-\frac{1}{\theta_{p,2}}(x - \theta_{p,3} - s)^2\right),
$$
which can be implemented as an \R~function as follows.
<<>>=
m <- function(s, x, thetap) {
gamma <- thetap[1] # amplitude
l <- thetap[2] # length scale
offset <- thetap[3] # offset
D <- outer(s + offset, x, '-') # displacements
gamma * exp(-D^2/l) # kernel eval.
}
@
\noindent Note the use of the function \fn{outer} with the subtraction operator. This function performs an ``outer operation'' (a generalization of the outer product) by computing an operation between every two elements of the first two arguments, in this case a subtraction.
We can now visualize some kernels by seeing how the process at $s = 0.5$ depends on $x$. Four such kernels are constructed below: the first is narrow and centered on 0.5; the second is slightly wider; the third is shifted to the right; and the fourth is shifted to the left. We store the parameters of the four different kernels in a list \cc{thetap}.
<<>>=
thetap <- list()
thetap[[1]] <- c(40, 0.0002, 0)
thetap[[2]] <- c(5.75, 0.01, 0)
thetap[[3]] <- c(8, 0.005, 0.1)
thetap[[4]] <- c(8, 0.005, -0.1)
@
\noindent Plotting proceeds by first evaluating the kernel for all $x$ at $s = 0.5$, and then plotting these evaluations against $x$. The first kernel is plotted below; plotting the other three is left as an exercise for the reader. \ifstandalone\else The kernels are shown in the top panels of Figure~\ref{fig:DSTMdynamics}.\fi
<<eval = FALSE>>=
m_x_0.5 <- m(s = 0.5, x = s_grid, # construct kernel
thetap = thetap[[1]]) %>% # at s = 0.5
as.numeric() # convert to numeric
df <- data.frame(x = s_grid, m = m_x_0.5) # allocate to df
ggplot(df) + geom_line(aes(x, m)) + theme_bw() # plot
@
<<echo = FALSE, results = 'hide', fig.keep = 'none'>>=
mplot <- list()
label <- c("(a)", "(b)", "(c)", "(d)")
for(i in 1:4) {
m_x_0.5 <- m(s = 0.5, x = s_grid, thetap = thetap[[i]]) %>% as.numeric()
df <- data.frame(x = s_grid, m = m_x_0.5)
mplot[[i]] <- ggplot(df) + geom_line(aes(x, m)) + theme_bw() +
ggtitle(label[i])
}
@
The last term we need to define is $\eta_t(\cdot)$. Here, we define it as a spatial process with an exponential covariance function with range parameter 0.1 and variance 0.1. The covariance matrix at each time point is then
<<>>=
Sigma_eta <- 0.1 * exp( -abs(outer(s_grid, s_grid, '-') / 0.1))
@
Simulating $\eta_t(s)$ over \cc{s\_grid} proceeds by generating a multivariate Gaussian vector with mean zero and covariance matrix \cc{Sigma\_eta}. To do this, one can use the function \fn{mvrnorm} from the package {\bf MASS}. Alternatively, one may use the lower Cholesky factor of \cc{Sigma\_eta} and multiply this by a vector of numbers generated from a mean-zero, variance-one, independent-elements Gaussian random vector \cite[see][Algorithm 2.3]{rue2005gaussian}\index[aut]{Rue, H.}\index[aut]{Held, L.}.
<<>>=
L <- t(chol(Sigma_eta)) # chol() returns upper Cholesky factor
sim <- L %*% rnorm(nrow(Sigma_eta)) # simulate
@
\noindent Type \fn{plot}\cc{(s\_grid, sim, }\strn{\textquotesingle l\textquotesingle}\cc{)} to plot this realization of $\eta_t(s)$ over \cc{s\_grid}.
\subsection*{Simulating from the IDE}
Now we have everything in place to simulate from the IDE. Simulation is most easily carried out using a \texttt{for} loop as shown below. We shall carry out four simulations, one for each kernel constructed above, and store the simulations in a list of four data frames, one for each simulation. The following command initializes this list.
<<>>=
Y <- list()
@
For each simulation setting (which we iterate using the index \cc{i}), we simulate the time points (which we iterate using \cc{j}) to obtain the process. The ``nested \texttt{for} loop'' below accomplishes this. In the outer loop, the kernel is constructed and the process is initialized to zero. In the inner loop, the integration is approximated using a Riemann sum,
$$
\int_{D_s}m(s,x;\bftheta_p)Y_{t-1}(x) \intd x \approx \sum_i m(s,x_i;\bftheta_p)Y_{t-1}(x_i)\Delta_s ,
$$
where we recall that we have set $\Delta_s = 0.01$. Next, at every time point $\eta_t(s)$ is simulated on the grid and added to the sum (an approximation of the integral) above.
<<message=FALSE, warning = FALSE>>=
for(i in 1:4) { # for each kernel
M <- m(s_grid, s_grid, thetap[[i]]) # construct kernel
Y[[i]] <- data.frame(s = s_grid, # init. data frame with s
t = 0, # init. time point 0, and
Y = 0) # init. proc. value = 0
for(j in t_grid[-1]) { # for each time point
prev_Y <- filter(Y[[i]], # get Y at t - 1
t == j - 1)$Y
eta <- L %*% rnorm(N) # simulate eta
new_Y <- (M %*% prev_Y * ds + eta) %>%
as.numeric() # Euler approximation
Y[[i]] <- rbind(Y[[i]], # update data frame
data.frame(s = s_grid,
t = j,
Y = new_Y))
}
}
@
\noindent Repeatedly appending data frames, as is done above, is computationally inefficient. For large systems it would be quicker to save a data frame for each time point in another list and then concatenate using \fn{rbindlist} from the package {\bf data.table}.
Since now \cc{Y[[i]]}, for \cc{i}\,$=1,\ldots,4$, contains a data frame in long format, it is straightforward to visualize. The code given below constructs the Hovm\"{o}ller plot for the IDE process for \cc{i}\,$=1$. Plotting for \cc{i}\,$=2,3,4$ is left as an exercise for the reader. \ifstandalone\else The resulting plots are shown in the bottom panels of Figure~\ref{fig:DSTMdynamics}.\fi
<<eval= FALSE>>=
ggplot(Y[[1]]) + geom_tile(aes(s, t, fill = Y)) +
scale_y_reverse() + theme_bw() +
fill_scale(name = "Y")
@
<<echo = FALSE, results = 'hide', fig.keep = 'none'>>=
g <- list()
for(i in 1:4) {
g[[i]] <- ggplot(Y[[i]]) + geom_tile(aes(s, t, fill = Y)) +
scale_y_reverse() + theme_bw() +
scale_fill_distiller(palette = "Spectral") +
theme(legend.position = "bottom") + coord_fixed(ratio = 0.0065)
}
gplot <- grid.arrange( arrangeGrob(mplot[[1]], mplot[[2]], mplot[[3]], mplot[[4]], nrow = 1), arrangeGrob(g[[1]],g[[2]], g[[3]], g[[4]], nrow = 1), heights = c(1,2))
ggsave(gplot, file = "img/Chapter_5/IDEkernels.png", width = 12, height = 6, dpi = 300)
@
\subsection*{Simulating Observations}
Now assume that we want to simulate noisy observations from one of the process models that we have just simulated from. Why would we want to do this? Frequently, the only way to test whether algorithms for inference are working as they should is to mimic both the underlying true process \emph{and} the measurement process. Working with simulated data is the first step in developing reliable algorithms that are then ready to be applied to real data.
To map the observations to the data we need an incidence matrix that picks out the process value that has been observed. This incidence matrix is simply composed of several rows, one for each observation, with zeros everywhere except for the entry corresponding to the process value that has been observed\ifstandalone.\else (recall Section \ref{sec:STDMproc}).\fi When the locations we are observing change over time, the incidence matrix correspondingly changes over time.
Suppose that at each time point we observe the process at 50 locations which, for convenience, are a subset of \cc{s\_grid}. (If this is not the case, some nearest-neighbor mapping or deterministic interpolation method can be used.)
<<>>=
nobs <- 50
sobs <- sample(s_grid, nobs)
@
\noindent Then the incidence matrix at time $t$, $\bH_t$, can be constructed by matching the observation locations on the space-time grid using the function \fn{which}.
<<>>=
Ht <- matrix(0, nobs, N) # construct empty matrix
for(i in 1:nobs) { # for each obs
idx <- which(sobs[i] == s_grid) # find the element to set to 1
Ht[i, idx] <- 1 # set to 1
}
@
\noindent Note that \cc{Ht} is sparse (contains many zeros), so sparse-matrix representations can be used to improve computational and memory efficiency; look up the packages {\bf Matrix} or {\bf spam} for more information on these representations.
We can repeat this procedure for every time point to simulate our data. At time $t$, the data are given by $\bZ_t = \bH_t\bY_t + \bfepsilon_t$, where $\bY_t$ is the latent process on the grid at time $t$, and $\bfepsilon_t$ is independent of $\bY_t$ and represents a Gaussian random vector whose entries are $iid$ with mean zero and variance $\sigma^2_\epsilon$. Assume $\sigma^2_\epsilon = 1$ and that $\bH_t$ is the same for each $t$. Then observations are simulated using the following \texttt{for} loop.
<<>>=
z_df <- data.frame() # init data frame
for(j in 0:(nT-1)) { # for each time point
Yt <- filter(Y[[1]], t == j)$Y # get the simulated process
zt <- Ht %*% Yt + rnorm(nobs) # map to obs and add noise
z_df <- rbind(z_df, # update data frame
data.frame(s = sobs, t = j, z = zt))
}
@
\noindent Plotting of the simulated observations proceeds using {\bf ggplot2} as follows.
<<results = 'hide', fig.keep = 'none'>>=
ggplot(z_df) + geom_point(aes(s, t, colour = z)) +
col_scale(name = "z") + scale_y_reverse() + theme_bw()
@
\noindent Note that the observations are noisy and reveal sizeable gaps. Filling in these gaps by first estimating all the parameters in the IDE from the data and then predicting at unobserved locations is the subject of Lab 5.2.
\ifstandalone
\bibliography{LitReviewBib,LitReviewR}
\fi
\end{document}