labs/spatiotemporal_models/spatiotemporal_models.qmd

---
title: "Spatial and Spatio-temporal Modelling"
subtitle: "SHARP Bayesian Modeling for Environmental Health Workshop"
author: "Garyfallos Konstantinoudis"
date: "August 15 2023"
format: html
---

```{r, echo=FALSE, warning=FALSE, message=FALSE}
library(here)
library(tidyverse)
library(nimble)
library(sf)
library(posterior)
library(bayesplot)
library(spdep)
library(lubridate)
library(colorspace)

extrafont::loadfonts()
theme_set(hrbrthemes::theme_ipsum())
knitr::opts_chunk$set(fig.align = "center")

set.seed(2)
```

## Goal of this computing lab session

This goal of this lab is to use `NIMBLE` to carry out a disease mapping study.

## What's going to happen in this lab session?

During this lab session, we will:

1. Explore ways of visualizing spatial data;
2. Define the neighborhood matrix in `R`;
3. Fit and interpret the BYM model in `NIMBLE`; and
4. Perform spatial ecological regression

## Introduction

We will use the COVID-19 deaths during March-July 2020, in England, at the LTLA geographical level (317 areas), as taken from the published paper:

__Konstantinoudis G__, Padellini T, Bennett JE, Davies B, Ezzati M, Blangiardo M. _Long-term exposure to air-pollution and COVID-19 mortality in England: a hierarchical spatial analysis_. medRxiv [Preprint]. 2020 Aug 11:2020.08.10.20171421. doi: 10.1101/2020.08.10.20171421. Update in: Environ Int. 2021 Jan;146:106316. PMID: 32817974; PMCID: PMC7430619.

For that analysis, we included 38,573 COVID-19 deaths up to June 30, 2020 at the Lower Layer Super Output Area level in England ($n = 32844$ small areas).
We retrieved averaged NO$_2$ concentration during 2014-2018 from the Pollution Climate Mapping.
We used Bayesian hierarchical models to quantify the effect of air pollution while adjusting for a series of confounding and spatial autocorrelation.

We will build simple Bayesian models to try to understand what is happening in the data.
Once again we will use `NIMBLE` as the basis for our Bayesian model writing.

## Visualization of spatial areal data

Let's load in the data
```{r}
data_england <- read_sf(here("data", "England", "COVIDecoregression.shp"))
glimpse(data_england)
summary(data_england)
class(data_england)

data_england_simpler <- rgeos::gSimplify(as(data_england, "Spatial"), tol = 500)
data_england_simpler <- st_as_sf(data_england_simpler)
data_england_simpler <- cbind(data_england_simpler, data_england %>% mutate(geometry = NULL))
data_england <- data_england_simpler
```

As with the previous labs, let's use the package `ggplot2` to plot maps of the variables.

And for nicer maps for the other columns, including COVID-19 deaths:
```{r fig.width=11}
ggplot(data = data_england, aes(fill = deaths)) +
  geom_sf(colour = "white", size = 0.1) +
  labs(title = "COVID-19 deaths") +
  theme_void() +
  scale_fill_continuous_sequential(palette = "Reds")
```

the expected number of deaths, and:
```{r fig.width=11}
ggplot(data = data_england, aes(fill = expectd)) +
  geom_sf(colour = "white", size = 0.1) +
  labs("Expected number of deaths") +
  theme_void() +
  scale_fill_continuous_sequential(palette = "Blues")
```

and the SMR:
```{r fig.width=11}
ggplot(data = data_england, aes(fill = deaths / expectd)) +
  geom_sf(colour = "white", size = 0.1) +
  labs("Standardised mortality ratio") +
  theme_void() +
  scale_fill_continuous_divergingx(palette = "RdBu", mid = 1, rev = TRUE)
```

And the covariates, including total ICU beds,:
```{r fig.width=11}
ggplot(data = data_england, aes(fill = TtlICUB)) +
  geom_sf(colour = "white", size = 0.1) +
  labs(title = "Total ICU beds") +
  theme_void() +
  scale_fill_continuous_sequential(palette = "Greens")
```

NO$_2$,:
```{r fig.width=11}
ggplot(data = data_england, aes(fill = NO2)) +
  geom_sf(colour = "white", size = 0.1) +
  labs(title = "NO2") +
  theme_void() +
  scale_fill_continuous_sequential(palette = "Terrain")
```

and quintile of Index of Multiple Deprivation (IMD):
```{r fig.width=11}
ggplot(data = data_england, aes(fill = as.factor(IMD))) +
  geom_sf(colour = "white", size = 0.1) +
  labs(title = "IMD quintile") +
  theme_void() +
  scale_fill_discrete_sequential(palette = "Heat", rev = FALSE)
```

## A model with an unstructured spatial component

The relative risks (RRs) will be smoothed using the Poisson model with a log-link function, as we have seen in previous lectures and labs.
As usual in this workshop, the inference is done with `NIMBLE` called through `R`.

In particular, let each area $i$ be indexed by the integers $1, 2,...,N$. The model is as follows:
$$
\begin{eqnarray}
O_{i}  & \sim & \text{Pois}(\lambda_{i}E_{i} ) \quad i = 1, 2,...,N \\
\log(\lambda_{i}) & = & \alpha + \theta_{i}  \\
\theta_{i} & \sim & N(0,\sigma_{\theta})
\end{eqnarray}
$$
where $\sigma_{\theta}$ is a standard deviation term that controls the magnitude of $\theta_{i}$.

As in previous labs, we will write the model in `NIMBLE.`
Specify the prior of $\tau_{\theta}$ in the code below (`tau.theta`).
We can try a Gamma with parameters 1 and 0.01, which is a sensible option for a Poisson count model as it goes from 0 upwards, so can not be negative, like counts:
```{r}
unstr_code <- nimbleCode({
  # priors
  alpha ~ dnorm(0, sd = 100000) # vague prior (large variance)
  tau.theta ~ dgamma(1, 0.01) # prior for the precision hyperparameter

  # likelihood
  for (i in 1:N) {
    O[i] ~ dpois(mu[i]) # Poisson likelihood
    log(mu[i]) <- log(E[i]) + alpha + theta[i]

    theta[i] ~ dnorm(0, tau = tau.theta) # area-specific RE
    RR[i] <- exp(alpha + theta[i]) # area-specific RR
  }

  # overall RR across study region
  overallRR <- exp(alpha)
})
```

The following code subsets the data to London so the models are quicker to run.
We're going to run the models for London, then load in the samples for England (and pretend we ran for England!).
```{r eval = FALSE}
data_england <- data_england[startsWith(data_england$LTLA, "E09"), ]
ggplot(data = data_england, fill = "NA") +
  geom_sf() +
  theme_void()
```

How many spatial units are in the map?
```{r}
# Obtain the number of LTLAs
n.LTLA <- nrow(data_england)
n.LTLA
```

Create data object as required for `NIMBLE`.
```{r}
# Format the data for NIMBLE in a list
covid_data <- list(
  O = data_england$deaths # observed nb of deaths
)

covid_constants <- list(
  N = n.LTLA, # number of LTLAs
  E = data_england$expectd # expected number of deaths
)
```

What are the parameters to be initialised?
Create a list with two elements and call it `inits` (each a list) with different initial values for the parameters:
```{r}
inits <- list(
  list(alpha = 0.01, tau.theta = 10, theta = rep(0.01, times = n.LTLA)), # chain 1
  list(alpha = 0.5, tau.theta = 1, theta = rep(-0.01, times = n.LTLA)) # chain 2
)
```

Set `parameters_to_monitor` a vector to monitor `alpha`, `theta`, `tau.theta`, `overallRR` and `RR`.
```{r}
parameters_to_monitor <- c("alpha", "theta", "tau.theta", "overallRR", "RR")
```
Note that the parameters that are not set, will NOT be monitored!

Run the MCMC simulations calling Nimble from R using the function `nimbleMCMC()`.
The burn-in should be long enough to discard the initial part of the Markov chains that have not yet converged to the stationary distribution.
```{r eval=FALSE, message=FALSE, warning=FALSE}
tic <- Sys.time()
modelGS.sim <- nimbleMCMC(
  code = unstr_code,
  data = covid_data,
  constants = covid_constants,
  inits = inits,
  monitors = parameters_to_monitor,
  niter = 50000,
  nburnin = 30000,
  thin = 100, # thinning to get make the posterior samples more independent – useful for correlated models
  nchains = 2,
  setSeed = 9,
  progressBar = TRUE,
  samplesAsCodaMCMC = TRUE,
  summary = TRUE,
  WAIC = TRUE
)

toc <- Sys.time()
toc - tic # ~ 2minutes
# saveRDS(modelGS.sim, file = "NIMBLE_IDD_A1")
```

It's good practice to save samples, especially if model runtimes get long.
Let's load the samples back up.
```{r echo = FALSE}
modelGS.sim <- readRDS("NIMBLE_IDD_A1")
```

What is the summary of each estimated parameter?
```{r}
summarise_draws(modelGS.sim$samples, default_summary_measures())
```

_The Gelman-Rubin diagnostic ($\widehat{R}$)_
The Gelman-Rubin diagnostic evaluates MCMC convergence by analyzing the difference between multiple Markov chains.
The convergence is assessed by comparing the estimated between-chains and within-chain variances for each model parameter.
Large differences between these variances indicate nonconvergence.
When the scale reduction factor is high (perhaps greater than 1.1), then we should run our chains out longer to improve convergence to the stationary distribution.
```{r}
summarise_draws(modelGS.sim$samples, default_convergence_measures())
```
The $\widehat{R}$ of the overallRR is always lower than 1.1.

With more complicated models, sometimes it's nice to see if the traceplots are working too.
Remember there are two chains of samples this time, which should look like they broadly have the same distribution if they have converged.
```{r eval = TRUE, warning = FALSE}
mcmc_trace(modelGS.sim$samples, pars = c("tau.theta"))
```

We can use the functions `mcmc_acf_bar()` and `mcmc_acf()` to get the autocorrelation plots for the different chains for `tau.theta` (which should be as near to `0` as possible).
```{r warning=FALSE}
# mcmc_acf_bar(modelGS.sim$samples, pars = c("tau.theta"))
mcmc_acf(modelGS.sim$samples, regex_pars = c("tau.theta"))
```

Use the functions `mcmc_hist()` and `mcmc_dens()` to get the histogram and the density plots of the posterior `tau.theta` for chain 1.
```{r eval = TRUE, fig.width=4, fig.height=3, warning = FALSE}
mcmc_hist(modelGS.sim$samples$chain1, pars = c("tau.theta"))
# mcmc_dens(modelGS.sim$samples$chain1, pars = c("tau.theta"))
```

Use the function `mcmc_dens_overlay()` to get the density plot of the posterior `tau.theta` for both chains.
```{r eval = TRUE, fig.width=4, fig.height=3, warning = FALSE}
mcmc_dens_overlay(modelGS.sim$samples, pars = c("tau.theta"))
```

To see the WAIC, which can be used to compare to other models to evaluate how well parameterised a model was.
```{r eval=TRUE}
modelGS.sim$WAIC
```

## Map of the (globally) smoothed RRs

Let's map the smoothed RRs in R we extract the posterior median of the relative risks
```{r eval=TRUE}
RR <- modelGS.sim$summary$all.chains[str_c("RR[", seq(n.LTLA), "]"), "Median"] # posterior median
```

Let's look at a map of the smoothed RRs
```{r eval=TRUE, fig.width=5, fig.align='center'}
data_england |>
  mutate(RR = RR) |>
  ggplot(aes(fill = RR)) +
  geom_sf(colour = "white", size = 0.1) +
  labs(title = "Standardised mortality ratio") +
  theme_void() +
  scale_fill_continuous_divergingx(palette = "RdBu", mid = 1, rev = TRUE)
```

## The BYM model

The RRs will be smoothed using the Poisson model and the BYM model in the log-link function.
In particular, let each area $i$ be indexed by  the integers $1, 2,...,N$.
The model is as follows.

::: aside
A Besag, York, Mollie (BYM) spatial model considers two things:

1. Neighboring Influence: It looks at how connected areas are. If two areas are close to each other, there's a chance that a disease can move from one area to another. So, if a disease is in one area, it might "spill over" into the neighboring area.

2. Background Risk: This is like the natural risk of the disease occurring, even without any special factors. Some areas might naturally have more cases due to other factors. The model takes this into account too.
:::

$$
\begin{equation}
\begin{aligned}
\hbox{O}_i & \sim \hbox{Poisson}(E_i \lambda_i); \;\;\; i=1,...,N\\
\log \lambda_i & = \alpha + \theta_i + \phi_i\\
\theta_i &\sim \hbox{Normal}(0, \sigma^2_{\theta_i})\\
{\bf \phi} & \sim \hbox{ICAR}({\bf W}, \sigma_{\phi}^2) \,\, ,  \sum_i \phi_i  = 0 \\
\alpha & \sim \text{Uniform}(-\infty, +\infty) \\
1/\sigma_{\theta}^2 & \sim \hbox{Gamma}(0.5, 0.05) \\
1/\sigma_{\phi}^2 & \sim \hbox{Gamma}(0.5, 0.0005) \\
\end{aligned}
\end{equation}
$$

where $\sigma_{\phi}$ is a standard deviation term that controls the magnitude of $\phi_{i}$, i.e. the spatial structured term.

Let's set up the BYM model.
```{r}
BYM_code <- nimbleCode({
  # priors
  alpha ~ dflat() # vague prior (Unif(-inf, +inf))
  overallRR <- exp(alpha) # overall RR across study region

  tau.theta ~ dgamma(0.5, 0.05) # prior for the precision hyperparameter
  sigma2.theta <- 1 / tau.theta # variance of unstructured area random effects

  tau.phi ~ dgamma(0.5, 0.0005) # prior on precision of spatial area random effects
  sigma2.phi <- 1 / tau.phi # conditional variance of spatial area random effects

  # likelihood
  for (i in 1:N) {
    O[i] ~ dpois(mu[i]) # Poisson likelihood for observed counts
    log(mu[i]) <- log(E[i]) + alpha + theta[i] + phi[i]

    theta[i] ~ dnorm(0, tau = tau.theta) # area-specific RE

    SMR[i] <- exp(alpha + theta[i] + phi[i])
    resRR[i] <- exp(theta[i] + phi[i]) # area-specific residual RR
    proba.resRR[i] <- step(resRR[i] - 1) # Posterior probability
  }

  # BYM
  phi[1:N] ~ dcar_normal(adj = adj[1:L], weights = weights[1:L], num = num[1:N], tau = tau.phi, zero_mean = 1)
})
```

# Creating the adjacency matrix

To fit a BYM model, we first need to define an adjacency matrix, which describes which spatial units are actually neighbouring each other.
The equation is below, but essentially if the units neighbour, they are given `1` in the relevant adjacency matrix, otherwise the element representing whether two spatial units are neighbouring (sometimes called 'contiguous') then it is given a value of `0`.
Most of the elements in a matrix will be `0` for a real-world situation with lots of spatial units.

To run the BYM model, the adjacency matrix needs to be provided.
Recall that there are many ways of defining an adjacency matrix ${\bf W}$. Here we will use queen contiguity which is defined as
$$
\begin{equation}
w_{ij} =
\begin{cases}
1 & \text{if } j \in \partial_i  \\
0         & \text{otherwise}
\end{cases}
\end{equation}
$$

where $\partial_i$ is the set of area adjacent to $i$, and $w_{ij}$ is the $ij$ element of ${\bf W}$.

# Adjacency matrix in R

Convert the polygons to a list of neighbors using the function `poly2nb()`
```{r}
LTLA_nb <- poly2nb(pl = data_england)
LTLA_nb
```

Convert the list you defined previously to `NIMBLE` format (i.e. a list of 3 components adj, num and weights) using the function `nb2WB()` and print a summary of the object.
```{r}
nbWB_A <- nb2WB(nb = LTLA_nb)
names(nbWB_A)
```

A list of three components is created:

1. `adj` = `ID` for all the neighbors;
2. `weights` = the weight for each neighbour; and
3. num = total nb of neighbors across the study region

Let's update the constants to include the adjacency matrix information
```{r eval=TRUE}
covid_constants <- list(
  N = n.LTLA, # nb of LTLAs
  E = data_england$expectd, # expected number of deaths
  # adjacency matrix
  L = length(nbWB_A$weights), # the number of neighboring areas
  adj = nbWB_A$adj, # the elements of the neighbouring matrix
  num = nbWB_A$num,
  weights = nbWB_A$weights
)
```


Define the initial values for __all__ the unknown parameters.
```{r echo=TRUE, eval=TRUE}
# initialise the unknown parameters, 2 chains
inits <- list(
  list(
    alpha = 0.01,
    tau.theta = 10,
    tau.phi = 1,
    theta = rep(0.01, times = n.LTLA),
    phi = c(rep(0.5, times = n.LTLA))
  ),
  list(
    alpha = 0.5,
    tau.theta = 1,
    tau.phi = 0.1,
    theta = rep(0.05, times = n.LTLA),
    phi = c(rep(-0.05, times = n.LTLA))
  )
)
```

Which model parameters do you want to monitor? Set these before running `NIMBLE` Call this object `parameters_to_monitor`.
```{r echo=TRUE, eval=TRUE}
parameters_to_monitor <- c("sigma2.theta", "sigma2.phi", "overallRR", "theta", "SMR", "resRR", "proba.resRR", "alpha")
```

Let's run the model.
```{r eval=FALSE}
tic <- Sys.time()
modelBYM.sim <- nimbleMCMC(
  code = BYM_code,
  data = covid_data,
  constants = covid_constants,
  inits = inits,
  monitors = parameters_to_monitor,
  niter = 50000,
  nburnin = 10000,
  thin = 10,
  nchains = 2,
  setSeed = 9,
  progressBar = TRUE,
  samplesAsCodaMCMC = TRUE,
  summary = TRUE,
  WAIC = TRUE
)
toc <- Sys.time()
toc - tic
# saveRDS(modelBYM.sim, file = "NIMBLE_BYM_A3")
```

```{r echo=TRUE, eval=TRUE}
modelBYM.sim <- readRDS("NIMBLE_BYM_A3")
```

Retrieve WAIC and compare with previous model.
Which model performs best?
```{r echo=TRUE, eval=TRUE}
modelBYM.sim$WAIC
```

Check convergence of resRR.
```{r echo=TRUE, eval=TRUE, warning = FALSE}
mcmc_trace(modelBYM.sim$samples, pars = c("resRR[1]", "resRR[19]", "resRR[25]", "resRR[33]"))
```

Extract the residual RR (i.e. exp(V + U)) and the posterior probability that resRR is higher than 1 as.
Note that for the posterior probability, which is a vector of 0s and 1s we need to calculate the sum of 1s by the total sum, ie the mean.
```{r echo=TRUE, eval=TRUE}
RR_BYM <- tibble(
  RR_BYM = modelBYM.sim$summary$all.chains[paste0("resRR[", 1:n.LTLA, "]"), "Median"],
  pp_BYM = modelBYM.sim$summary$all.chains[paste0("proba.resRR[", 1:n.LTLA, "]"), "Mean"]
)
```

Map the smoothed residual RR (`resRR`).
```{r echo=TRUE, eval=TRUE, fig.width=10, warning=FALSE}
data_england |>
  mutate(RR_BYM = RR_BYM$RR_BYM) |>
  ggplot(aes(fill = RR_BYM)) +
  geom_sf(colour = "white", size = 0.1) +
  labs(title = "Standardised mortality ratio") +
  theme_void() +
  scale_fill_continuous_divergingx(palette = "RdBu", mid = 1, rev = TRUE)

data_england |>
  mutate(pp_BYM = RR_BYM$pp_BYM) |>
  ggplot(aes(fill = pp_BYM)) +
  geom_sf(colour = "white", size = 0.1) +
  labs(title = "Exceedence probability") +
  theme_void() +
  scale_fill_continuous_sequential(
    palette = "Sunset",
    rev = TRUE,
    name = "Posterior probability",
    limits = c(0, 1),
    breaks = c(0, 0.5, 1)
  )
```

## Closing remarks

In this lab session, we have explored how to fit spatial models using Bayesian regression in `NIMBLE`.
We looked at the two most common approaches: iid and the BYM model.

We used real data on COVID-19 deaths in England during March-July 2020 and first performed disease mapping to understand the spatial trends of COVID-19 mortality in the first stages of the pandemic. As part of this analysis, we visualised spatial data, fitted a model with an unstructured spatial random effect, demonstrated how to build a neihgborhood matrix in R and fitted the BYM model.

For more information on the final model see Konstantinoudis et al 2021 (10.1016/j.envint.2020.106316).

## Advanced: Ecological regression with the BYM model

The section below is very relevant for environmental health, but if we run out of time in the lab, it is something you should go through in your own time.

Let $\mathcal{D}$ be the observation window of England and $A_1, A_2, \dots, A_N$ a partition denoting the LTLAs in England with $\cup_{i=1}^NA_i = \mathcal{D}$ and $A_i\cap A_j$ for every $i\neq j$.
Let $O_1, O_2, \dots, O_N$ be the observed number of COVID-19 deaths occurred during March-July 2020 in England, $E_1, E_2, \dots, E_N$ is the expected number of COVID-19 deaths and $\lambda_1, \lambda_2, \dots, \lambda_N$ the standardized mortality ratio (recall $\lambda_i = \frac{O_i}{E_i}$).
A standardized mortality ratio of $1.5$ implies that the COVID-19 deaths we observed in the $i$-th area are $1.5$ times higher to what we expected.
Under the Poisson assumption we have:
$$
\begin{equation}
\begin{aligned}
\hbox{O}_i & \sim \hbox{Poisson}(E_i \lambda_i); \;\;\; i=1,...,N\\
\log \lambda_i & = \alpha +  \beta_1 X_{1i} + \beta_2 X_{2i} + \theta_i + \phi_i\\
\theta_i &\sim \hbox{Normal}(0, \sigma^2_{\theta_i})\\
{\bf \phi} & \sim \hbox{ICAR}({\bf W}, \sigma_{\phi}^2) \,\, ,  \sum_i \phi_i  = 0 \\
\alpha & \sim \text{Uniform}(-\infty, +\infty) \\
\beta_1, \beta_2 & \sim \mathcal{N}(0, 10) \\
1/\sigma_{\theta}^2 & \sim \hbox{Gamma}(0.5, 0.05) \\
1/\sigma_{\phi}^2 & \sim \hbox{Gamma}(0.5, 0.0005) \\
\end{aligned}
\end{equation}
$$
the terms $\beta_1 X_{1i} + \beta_2 X_{2i} + \sum_{j=2}^5\beta_{3j} X_{3i}$, where $X_{1i}, X_{2i}, X_{3i}$ are the ICU beds, NO$_2$ and IMD in the $i$-th LTLA, $\beta_1, \beta_2, \sum_{j=2}^5\beta_{3j}$ the corresponding effects and $exp(\beta_1), exp(\beta_2)$ the relative risk of ICU beds or NO$_2$ for every unit increase and of the ICU beds or NO$_2$.
For instance $exp(\beta_2) = 1.8$ means that for every unit increase of long term exposure to $NO_2$, the risk (read standardized mortality ratio) of COVID-19 deaths cancer increases by $80\%$. $exp(\beta_{32}), \beta_{33}, \beta_{34}, \beta_{35}$ are the relative risks compared to the baseline IMD category, ie the most deprived areas.
An $exp(\beta_{35}) = 0.5$ means that the risk of COVID-19 deaths in most affluent areas decreases by $50%$ compared to the most deprived areas.$\tau_{\theta}$ is a precision (reciprocal of the variance) term that controls the magnitude of $\theta_{i}$.
We will first write the model in `NIMBLE`.

```{r eval=TRUE, echo = TRUE}
BYMecoCode <- nimbleCode({
  # priors
  alpha ~ dflat() # vague prior (Unif(-inf, +inf))
  overallRR <- exp(alpha) # overall RR across study region

  tau.theta ~ dgamma(0.5, 0.05) # prior for the precision hyperparameter
  sigma2.theta <- 1 / tau.theta # variance of unstructured area random effects

  tau.phi ~ dgamma(0.5, 0.0005) # prior on precison of spatial area random effects
  sigma2.phi <- 1 / tau.phi # conditional variance of spatial area random effects

  for (j in 1:K) {
    beta[j] ~ dnorm(0, tau = 1)
    RR.beta[j] <- exp(beta[j])
  }

  RR.beta1_1NO2 <- exp(beta[1] / sd.no2) # get per 1 unit increase in the airpollution (scale back)

  # likelihood
  for (i in 1:N) {
    O[i] ~ dpois(mu[i]) # Poisson likelihood for observed counts
    log(mu[i]) <- log(E[i]) + alpha + theta[i] + phi[i] + inprod(beta[], X[i, ])
    # the inprod is equivalent to beta[1]*X1[i] + beta[2]*X2[i] + beta[3]*X32[i] + beta[4]*X33[i] + beta[5]*X34[i] + beta[6]*X35[i]

    SMR[i] <- alpha + theta[i] + phi[i] + inprod(beta[], X[i, ])
    theta[i] ~ dnorm(0, tau = tau.theta) # area-specific RE
    resRR[i] <- exp(theta[i] + phi[i]) # area-specific residual RR
    proba.resRR[i] <- step(resRR[i] - 1) # Posterior probability
  }

  # BYM prior
  phi[1:N] ~ dcar_normal(adj = adj[1:L], weights = weights[1:L], num = num[1:N], tau = tau.phi, zero_mean = 1)
})
```

Create data object as required for `NIMBLE`.
```{r eval=TRUE}
n.LTLA <- dim(data_england)[1]

# create the dummy columns for deprivation
data_england <- data_england |>
  mutate(IMD = as_factor(IMD)) |>
  mutate(as.data.frame(model.matrix(~ 0 + IMD, data = pick(everything()))))

# matrix of covariates
Xmat <- cbind(
  scale(data_england$NO2)[, 1],
  scale(data_england$TtlICUB)[, 1],
  data_england$IMD2,
  data_england$IMD3,
  data_england$IMD4,
  data_england$IMD5
)

# Format the data for NIMBLE in a list
covid_data <- list(
  O = data_england$deaths, # observed nb of deaths

  # covariates
  X = Xmat
)

# number of total covariates
K <- ncol(Xmat)

covid_constants <- list(
  N = n.LTLA, # nb of LTLAs
  K = K, # number of covariates

  # adjacency matrix
  L = length(nbWB_A$weights), # the number of neighboring areas
  E = data_england$expectd, # expected number of deaths
  adj = nbWB_A$adj, # the elements of the neighbouring matrix
  num = nbWB_A$num,
  weights = nbWB_A$weights
)
```

Create the initial values for ALL the unknown parameters:
```{r}
# initialise the unknown parameters, 2 chains
inits <- list(
  list(
    alpha = 0.01,
    beta = rep(0, K),
    tau.theta = 10,
    tau.phi = 1,
    theta = rep(0.01, times = n.LTLA),
    phi = c(rep(0.5, times = n.LTLA))
  ),
  list(
    alpha = 0.5,
    beta = rep(-1, K),
    tau.theta = 1,
    tau.phi = 0.1,
    theta = rep(0.05, times = n.LTLA),
    phi = c(rep(-0.05, times = n.LTLA))
  )
)
```

Which model parameters do you want to monitor? Set these before running `NIMBLE`. Call this object `parameters_to_monitor`.
```{r}
parameters_to_monitor <- c("sigma2.theta", "sigma2.phi", "overallRR", "theta", "beta", "RR.beta", "resRR", "proba.resRR", "alpha", "RR.beta1_1NO2")
```

Run the MCMC simulations using the function `nimbleMCMC()`.
If everything is specified reasonably, this needs approximately 5 minutes.
```{r echo=TRUE, eval=FALSE}
tic <- Sys.time()
modelBYMeco.sim <- nimbleMCMC(
  code = BYMecoCode,
  data = covid_data,
  constants = covid_constants,
  inits = inits,
  monitors = parameters_to_monitor,
  niter = 50000,
  nburnin = 30000,
  thin = 10,
  nchains = 2,
  setSeed = 9,
  progressBar = TRUE,
  samplesAsCodaMCMC = TRUE,
  summary = TRUE,
  WAIC = TRUE
)
toc <- Sys.time()
toc - tic
# saveRDS(modelBYMeco.sim, file = "NIMBLE_BYM_A4")
```

```{r}
modelBYMeco.sim <- readRDS("NIMBLE_BYM_A4")
```

Retrieve WAIC and compare with previous model.
Which model performs best?
```{r warning=FALSE}
modelBYMeco.sim$WAIC
```

Check the convergence of the intercept and covariates NO$_2$ and ICU beds. What do you observe?
```{r echo=TRUE, eval=TRUE, warning = FALSE, fig.height=7, fig.width=10}
mcmc_trace(modelBYMeco.sim$samples, pars = c("alpha", paste0("beta[", 1:K, "]")))
```

Retrieve summary statistics for the two covariates and interpret (it is easier to interpret on the relative scale):
```{r echo=TRUE, eval=TRUE}
modelBYMeco.sim$summary$all.chains[paste0("RR.beta[", 1:K, "]"), ]
```

We can get a nice credible intervals plot as well:
```{r warning=FALSE}
modelBYMeco.sim$summary$all.chains[paste0("RR.beta[", 1:K, "]"), ] |>
  as_tibble() |>
  select(Median, `95%CI_low`, `95%CI_upp`) |>
  mutate(
    covariate = factor(c("NO2", "ICU", paste0("IMD", 2:5)), levels = c("NO2", "ICU", paste0("IMD", 2:5)))
  ) -> cov.eff

cov.eff |> head()

cov.eff |>
  ggplot(aes(x = covariate, y = Median)) +
  geom_point() +
  geom_errorbar(aes(x = covariate, ymin = `95%CI_low`, ymax = `95%CI_upp`), width = 0.2) +
  ylim(c(0.5, 1.5)) +
  geom_hline(yintercept = 1, lty = 2, col = "red")
```

The effect by unit increase in the long term exposure to NO$_2$ is `RR.beta1_1NO2`:
```{r}
# as relative risk per 1 unit increase in the long term NO2 exposure
modelBYMeco.sim$summary$all.chains[paste0("RR.beta1_1NO2"), c("Median", "95%CI_low", "95%CI_upp")]

# as percentage increase in mortality for 1 unit increase in the long term NO2 exposure
(modelBYMeco.sim$summary$all.chains[paste0("RR.beta1_1NO2"), c("Median", "95%CI_low", "95%CI_upp")] - 1) * 100
```

Compare the BYM-related spatial field before and after adjusting for covariates:
```{r fig.width=10, warning=FALSE}
data_england |>
  mutate(RR_BYM = RR_BYM$RR_BYM) |>
  ggplot(aes(fill = RR_BYM)) +
  geom_sf(colour = "white", size = 0.1) +
  labs(title = "Posterior median RR (unadj)") +
  theme_void() +
  scale_fill_continuous_divergingx(palette = "RdBu", mid = 1, rev = TRUE)

data_england |>
  mutate(RR_BYM_ECO = modelBYMeco.sim$summary$all.chains[paste0("resRR[", 1:n.LTLA, "]"), "Median"]) |>
  ggplot(aes(fill = RR_BYM_ECO)) +
  geom_sf(colour = "white", size = 0.1) +
  labs(title = "Posterior median RR (adj)") +
  theme_void() +
  scale_fill_continuous_divergingx(palette = "RdBu", mid = 1, rev = TRUE)
```

## Further closing remarks for Advanced section
If you got through the advanced portion, we examined the effect of long term exposure to NO$_2$ on COVID-19 mortality.
We fitted a BYM model to account for unknown spatial confounding but in addition we accounted for total number of ICU beds and deprivation per LTLA.
We reported evidence of an increased COVID-19 mortality for increasing levels of NO$_2$.