Chap5_Lab1.Rnw

\documentclass[b5paper,11pt]{book}
\input{preamble}

\ifthenelse{\equal{\jobname}{\detokenize{Chap5_Lab1}}}
{\standalonetrue}
{\standalonefalse}

\begin{document}

<<echo=FALSE>>=
opts_chunk$set(background = rgb(1,1,0.85),
               size='small')
@

\addcontentsline{toc}{section}{Lab 5.1: Implementing an IDE Model in One-Dimensional Space}
\section*{Lab 5.1: Implementing an IDE Model in One-Dimensional Space}\label{lab:5_IDEsim}
\markright{LAB 5.1: IMPLEMENTING AN IDE MODEL IN ONE-DIMENSIONAL SPACE}

In this Lab we take a look at how one can implement a stochastic integro-difference equation (IDE) in one-dimensional space and time, from first principles. Specifically, we shall consider the dynamic model,


$$Y_{t}(s) = \int_{D_s}m(s,x;\bftheta_p)Y_{t-1}(x) \intd x + \eta_t(s),\quad s,x \in D_s,$$

\noindent where $Y_t(\cdot)$ is the spatio-temporal process at time $t$; $\bftheta_p$ are parameters that we fix (in practice, these will be estimated from data; see Lab 5.2); and $\eta_t(\cdot)$ is a spatial process, inde\-pend\-ent of $Y_t(\cdot)$, with covariance function that we shall assume is known.

We only need the packages {\bf dplyr}, {\bf ggplot2}, and {\bf STRbook} for this lab and, for reproducibility purposes, we fix the seed.

<<message=FALSE, warning=FALSE>>=
library("dplyr")
library("ggplot2")
library("STRbook")
set.seed(1)
@

<<echo = FALSE, warning=FALSE, message=FALSE>>=
library("gridExtra")
@


\subsection*{Constructing the Process Grid and Kernel}

We start off by constructing a discretization of the one-dimensional spatial domain $D_s = [0,1]$. We shall use this discretization, containing cells of width $\Delta_s$, for both approximate integrations as well as visualizations. We call this our spatial grid.

<<>>=
ds <- 0.01
s_grid <- seq(0, 1, by = ds)
N <- length(s_grid)
@


\noindent Our space-time grid is formed by calling $\fn{expand.grid}$ with \cc{s\_grid} and our temporal domain, which we define as the set of integers spanning 0 up to $T = 200$.

<<>>=
nT <- 201
t_grid <- 0:(nT-1)
st_grid <- expand.grid(s = s_grid, t = t_grid)
@


The transition kernel $m(s,x; \bftheta_p)$ is a bivariate function on our spatial grid. It is defined below to be a Gaussian kernel, where the entries of $\bftheta_p = (\theta_{p,1},\theta_{p,2},\theta_{p,3})'$ are the amplitude, the scale (aperture, twice the variance), and the shift (offset) of the kernel, respectively. Specifically,
$$
m(s,x; \bftheta_p) \equiv \theta_{p,1}\exp\left(-\frac{1}{\theta_{p,2}}(x - \theta_{p,3} - s)^2\right),
$$
which can be implemented as an \R~function as follows.
<<>>=
m <- function(s, x, thetap) {
  gamma <- thetap[1]                 # amplitude
  l <- thetap[2]                     # length scale
  offset <- thetap[3]                # offset
  D <- outer(s + offset, x, '-')     # displacements
  gamma * exp(-D^2/l)                # kernel eval.
}
@
\noindent Note the use of the function \fn{outer} with the subtraction operator. This function performs an ``outer operation'' (a generalization of the outer product) by computing an operation between every two elements of the first two arguments, in this case a subtraction.

We can now visualize some kernels by seeing how the process at $s = 0.5$ depends on $x$. Four such kernels are constructed below: the first is narrow and centered on 0.5; the second is slightly wider; the third is shifted to the right; and the fourth is shifted to the left. We store the parameters of the four different kernels in a list \cc{thetap}.

<<>>=
thetap <- list()
thetap[[1]] <- c(40, 0.0002, 0)
thetap[[2]] <- c(5.75, 0.01, 0)
thetap[[3]] <- c(8, 0.005, 0.1)
thetap[[4]] <- c(8, 0.005, -0.1)
@

\noindent Plotting proceeds by first evaluating the kernel for all $x$ at $s = 0.5$, and then plotting these evaluations against $x$. The first kernel is plotted below; plotting the other three is left as an exercise for the reader. \ifstandalone\else The kernels are shown in the top panels of Figure~\ref{fig:DSTMdynamics}.\fi

<<eval = FALSE>>=
m_x_0.5 <- m(s = 0.5, x = s_grid,            # construct kernel
             thetap = thetap[[1]]) %>%       # at s = 0.5
           as.numeric()                      # convert to numeric
df <- data.frame(x = s_grid, m = m_x_0.5)      # allocate to df
ggplot(df) + geom_line(aes(x, m)) + theme_bw() # plot
@

<<echo = FALSE, results = 'hide', fig.keep = 'none'>>=
mplot <- list()
label <- c("(a)", "(b)", "(c)", "(d)")
for(i in 1:4)  {
  m_x_0.5 <- m(s = 0.5, x = s_grid, thetap = thetap[[i]]) %>% as.numeric()
  df <- data.frame(x = s_grid, m = m_x_0.5)
  mplot[[i]] <- ggplot(df) + geom_line(aes(x, m)) + theme_bw() +
  ggtitle(label[i])
}
@

The last term we need to define is $\eta_t(\cdot)$. Here, we define it as a spatial process with an exponential covariance function with range parameter 0.1 and variance 0.1. The covariance matrix at each time point is then

<<>>=
Sigma_eta <- 0.1 * exp( -abs(outer(s_grid, s_grid, '-') / 0.1))
@

Simulating $\eta_t(s)$ over \cc{s\_grid} proceeds by generating a multivariate Gaussian vector with mean zero and covariance matrix \cc{Sigma\_eta}. To do this, one can use the function \fn{mvrnorm} from the package {\bf MASS}. Alternatively, one may use the lower Cholesky factor of \cc{Sigma\_eta} and multiply this by a vector of numbers generated from a mean-zero, variance-one, independent-elements  Gaussian random vector \cite[see][Algorithm 2.3]{rue2005gaussian}\index[aut]{Rue, H.}\index[aut]{Held, L.}.

<<>>=
L <- t(chol(Sigma_eta))  # chol() returns upper Cholesky factor
sim <- L %*% rnorm(nrow(Sigma_eta))  # simulate
@

\noindent Type \fn{plot}\cc{(s\_grid, sim, }\strn{\textquotesingle l\textquotesingle}\cc{)} to plot this realization of $\eta_t(s)$ over \cc{s\_grid}.

\subsection*{Simulating from the IDE}

Now we have everything in place to simulate from the IDE. Simulation is most easily carried out using a \texttt{for} loop as shown below. We shall carry out four simulations, one for each kernel constructed above, and store the simulations in a list of four data frames, one for each simulation. The following command initializes this list.

<<>>=
Y <- list()
@

For each simulation setting (which we iterate using the index \cc{i}), we simulate the time points (which we iterate using \cc{j}) to obtain the process. The ``nested \texttt{for} loop'' below accomplishes this. In the outer loop, the kernel is constructed and the process is initialized to zero. In the inner loop, the integration is approximated using a Riemann sum,
$$
\int_{D_s}m(s,x;\bftheta_p)Y_{t-1}(x) \intd x \approx \sum_i m(s,x_i;\bftheta_p)Y_{t-1}(x_i)\Delta_s ,
$$
where we recall that we have set $\Delta_s = 0.01$. Next, at every time point $\eta_t(s)$ is simulated on the grid and added to the sum (an approximation of the integral) above.


<<message=FALSE, warning = FALSE>>=
for(i in 1:4) {                         # for each kernel
  M <- m(s_grid, s_grid, thetap[[i]])   # construct kernel
  Y[[i]] <- data.frame(s = s_grid,      # init. data frame with s
                       t = 0,           # init. time point 0, and
                       Y = 0)           # init. proc. value = 0
  for(j in t_grid[-1]) {                # for each time point
    prev_Y <- filter(Y[[i]],            # get Y at t - 1
                     t == j - 1)$Y
    eta <- L %*% rnorm(N)               # simulate eta
    new_Y <- (M %*% prev_Y * ds + eta) %>%
             as.numeric()               # Euler approximation

    Y[[i]] <- rbind(Y[[i]],             # update data frame
                    data.frame(s = s_grid,
                               t = j,
                               Y =  new_Y))
  }
}
@

\noindent Repeatedly appending data frames, as is done above, is computationally inefficient. For large systems it would be quicker to save a data frame for each time point in another list and then concatenate using \fn{rbindlist} from the package {\bf data.table}.

Since now \cc{Y[[i]]}, for \cc{i}\,$=1,\ldots,4$, contains a data frame in long format, it is straightforward to visualize. The code given below constructs the Hovm\"{o}ller plot for the IDE process for \cc{i}\,$=1$. Plotting for \cc{i}\,$=2,3,4$ is left as an exercise for the reader. \ifstandalone\else The resulting plots are shown in the bottom panels of Figure~\ref{fig:DSTMdynamics}.\fi

<<eval= FALSE>>=
ggplot(Y[[1]]) + geom_tile(aes(s, t, fill = Y)) +
    scale_y_reverse() + theme_bw() +
    fill_scale(name = "Y")
@


<<echo = FALSE, results = 'hide', fig.keep = 'none'>>=
g <- list()
for(i in 1:4) {
  g[[i]] <- ggplot(Y[[i]]) + geom_tile(aes(s, t, fill = Y)) +
    scale_y_reverse() + theme_bw() +
    scale_fill_distiller(palette = "Spectral") +
    theme(legend.position = "bottom") + coord_fixed(ratio =  0.0065)
}

gplot <- grid.arrange( arrangeGrob(mplot[[1]], mplot[[2]], mplot[[3]], mplot[[4]], nrow = 1), arrangeGrob(g[[1]],g[[2]], g[[3]], g[[4]], nrow = 1), heights = c(1,2))

ggsave(gplot, file = "img/Chapter_5/IDEkernels.png", width = 12, height = 6, dpi = 300)

@

\subsection*{Simulating Observations}

Now assume that we want to simulate noisy observations from one of the process models that we have just simulated from. Why would we want to do this? Frequently, the only way to test whether algorithms for inference are working as they should is to mimic both the underlying true process \emph{and} the measurement process. Working with simulated data is the first step in developing reliable algorithms that are then ready to be applied to real data.

To map the observations to the data we need an incidence matrix that picks out the process value that has been observed. This incidence matrix is simply composed of several rows, one for each observation, with zeros everywhere except for the entry corresponding to the process value that has been observed\ifstandalone.\else (recall Section \ref{sec:STDMproc}).\fi When the locations we are observing change over time, the incidence matrix correspondingly changes over time.

Suppose that at each time point we observe the process at 50 locations which, for convenience, are a subset of \cc{s\_grid}. (If this is not the case, some nearest-neighbor mapping or deterministic interpolation method can be used.)

<<>>=
nobs <- 50
sobs <- sample(s_grid, nobs)
@

\noindent Then the incidence matrix at time $t$, $\bH_t$, can be constructed by matching the observation locations on the space-time grid using the function \fn{which}.

<<>>=
Ht <- matrix(0, nobs, N)           # construct empty matrix
for(i in 1:nobs) {                 # for each obs
  idx <- which(sobs[i] == s_grid)  # find the element to set to 1
  Ht[i, idx] <- 1                  # set to 1
}
@

\noindent Note that \cc{Ht} is sparse (contains many zeros), so sparse-matrix representations can be used to improve computational and memory efficiency; look up the packages {\bf Matrix} or {\bf spam} for more information on these representations.

We can repeat this procedure for every time point to simulate our data. At time $t$, the data are given by $\bZ_t = \bH_t\bY_t + \bfepsilon_t$, where $\bY_t$ is the latent process on the grid at time $t$, and $\bfepsilon_t$ is independent of $\bY_t$ and represents a Gaussian random vector whose entries are $iid$ with mean zero and variance $\sigma^2_\epsilon$. Assume $\sigma^2_\epsilon = 1$ and that $\bH_t$ is the same for each $t$. Then observations are simulated using the following \texttt{for} loop.

<<>>=
z_df <- data.frame()               # init data frame
for(j in 0:(nT-1)) {               # for each time point
  Yt <- filter(Y[[1]], t == j)$Y   # get the simulated process
  zt <- Ht %*% Yt + rnorm(nobs)    # map to obs and add noise
  z_df <- rbind(z_df,              # update data frame
                data.frame(s = sobs, t = j, z = zt))
}
@

\noindent Plotting of the simulated observations proceeds using {\bf ggplot2} as follows.

<<results = 'hide', fig.keep = 'none'>>=
ggplot(z_df) + geom_point(aes(s, t, colour = z))  +
  col_scale(name = "z") + scale_y_reverse() + theme_bw()
@

\noindent Note that the observations are noisy and reveal sizeable gaps. Filling in these gaps by first estimating all the parameters in the IDE from the data and then predicting at unobserved locations is the subject of Lab 5.2.

\ifstandalone
\bibliography{LitReviewBib,LitReviewR}
\fi


\end{document}