Julian Faraway 2023-06-30
See the introduction for an overview.
This example is discussed in more detail in my book Extending the Linear Model with R
Required libraries:
library(faraway)
library(ggplot2)
library(lme4)
library(pbkrtest)
library(RLRsim)
library(INLA)
library(knitr)
library(cmdstanr)
register_knitr_engine(override = FALSE)
library(brms)
library(mgcv)
In an agricultural field trial, the objective was to determine the effects of two crop varieties and four different irrigation methods. Eight fields were available, but only one type of irrigation may be applied to each field. The fields may be divided into two parts with a different variety planted in each half. The whole plot factor is the method of irrigation, which should be randomly assigned to the fields. Within each field, the variety is randomly assigned.
Load in and plot the data:
data(irrigation, package="faraway")
summary(irrigation)
field irrigation variety yield
f1 :2 i1:4 v1:8 Min. :34.8
f2 :2 i2:4 v2:8 1st Qu.:37.6
f3 :2 i3:4 Median :40.1
f4 :2 i4:4 Mean :40.2
f5 :2 3rd Qu.:42.7
f6 :2 Max. :47.6
(Other):4
ggplot(irrigation, aes(y=yield, x=field, shape=variety, color=irrigation)) + geom_point()
The irrigation and variety are fixed effects, but the field is a random
effect. We must also consider the interaction between field and variety,
which is necessarily also a random effect because one of the two
components is random. The fullest model that we might consider is:
$$y_{ijk} = \mu + i_i + v_j + (iv){ij} + f_k + (vf){jk} + \epsilon_{ijk}$$
where $\mu, i_i, v_j, (iv){ij}$ are fixed effects; the rest are random
having variances $\sigma^2_f$, $\sigma^2{vf}$ and
We fit this model with:
lmod <- lme4::lmer(yield ~ irrigation * variety + (1|field), irrigation)
faraway::sumary(lmod)
Fixed Effects:
coef.est coef.se
(Intercept) 38.50 3.03
irrigationi2 1.20 4.28
irrigationi3 0.70 4.28
irrigationi4 3.50 4.28
varietyv2 0.60 1.45
irrigationi2:varietyv2 -0.40 2.05
irrigationi3:varietyv2 -0.20 2.05
irrigationi4:varietyv2 1.20 2.05
Random Effects:
Groups Name Std.Dev.
field (Intercept) 4.02
Residual 1.45
---
number of obs: 16, groups: field, 8
AIC = 65.4, DIC = 91.8
deviance = 68.6
The fixed effects don’t look very significant. We could use a parametric
bootstrap to test this but it’s less work to use the pbkrtest
package
which implements the Kenward-Roger approximation. First test the
interaction:
lmoda <- lmer(yield ~ irrigation + variety + (1|field),data=irrigation)
faraway::sumary(lmoda)
Fixed Effects:
coef.est coef.se
(Intercept) 38.43 2.95
irrigationi2 1.00 4.15
irrigationi3 0.60 4.15
irrigationi4 4.10 4.15
varietyv2 0.75 0.60
Random Effects:
Groups Name Std.Dev.
field (Intercept) 4.07
Residual 1.19
---
number of obs: 16, groups: field, 8
AIC = 68.8, DIC = 85.1
deviance = 70.0
pbkrtest::KRmodcomp(lmod, lmoda)
large : yield ~ irrigation * variety + (1 | field)
small : yield ~ irrigation + variety + (1 | field)
stat ndf ddf F.scaling p.value
Ftest 0.25 3.00 4.00 1 0.86
We can drop the interaction. Now test for a variety effect:
lmodi <- lmer(yield ~ irrigation + (1|field), irrigation)
KRmodcomp(lmoda, lmodi)
large : yield ~ irrigation + variety + (1 | field)
small : yield ~ irrigation + (1 | field)
stat ndf ddf F.scaling p.value
Ftest 1.58 1.00 7.00 1 0.25
The variety can go also. Now check the irrigation method.
lmodv <- lmer(yield ~ variety + (1|field), irrigation)
KRmodcomp(lmoda, lmodv)
large : yield ~ irrigation + variety + (1 | field)
small : yield ~ variety + (1 | field)
stat ndf ddf F.scaling p.value
Ftest 0.39 3.00 4.00 1 0.77
This can go also. As a final check, lets compare the null model with no fixed effects to the full model.
lmodn <- lmer(yield ~ 1 + (1|field), irrigation)
KRmodcomp(lmod, lmodn)
large : yield ~ irrigation * variety + (1 | field)
small : yield ~ 1 + (1 | field)
stat ndf ddf F.scaling p.value
Ftest 0.38 7.00 4.48 0.903 0.88
This confirms the lack of statistical significance for the variety and irrigation factors.
We can check the significance of the random effect (field) term with:
RLRsim::exactRLRT(lmod)
simulated finite sample distribution of RLRT.
(p-value based on 10000 simulated values)
data:
RLRT = 6.11, p-value = 0.0097
We can see that there is a significant variation among the fields.
Integrated nested Laplace approximation is a method of Bayesian computation which uses approximation rather than simulation. More can be found on this topic in Bayesian Regression Modeling with INLA and the chapter on GLMMs
Use the most recent computational methodology:
inla.setOption(inla.mode="experimental")
inla.setOption("short.summary",TRUE)
Try the default INLA fit
formula <- yield ~ irrigation + variety +f(field, model="iid")
result <- inla(formula, family="gaussian", data=irrigation)
summary(result)
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 38.465 2.667 33.144 38.459 43.825 38.461 0
irrigationi2 0.954 3.746 -6.577 0.961 8.439 0.958 0
irrigationi3 0.556 3.746 -6.972 0.563 8.043 0.561 0
irrigationi4 4.031 3.746 -3.511 4.042 11.505 4.038 0
varietyv2 0.750 0.596 -0.439 0.750 1.938 0.750 0
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for the Gaussian observations 0.919 0.446 0.309 0.834 2.020 0.729
Precision for field 0.102 0.065 0.024 0.087 0.268 0.090
is computed
Default looks more plausible than one way and RBD examples.
Compute the transforms to an SD scale for the field and error. Make a table of summary statistics for the posteriors:
sigmaalpha <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[2]])
sigmaepsilon <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[1]])
restab=sapply(result$marginals.fixed, function(x) inla.zmarginal(x,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaalpha,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaepsilon,silent=TRUE))
colnames(restab) = c("mu","ir2","ir3","ir4","v2","alpha","epsilon")
data.frame(restab)
mu ir2 ir3 ir4 v2 alpha epsilon
mean 38.465 0.95396 0.55684 4.0316 0.74975 3.5932 1.1344
sd 2.6652 3.7441 3.7441 3.7442 0.59506 1.1424 0.27803
quant0.025 33.146 -6.5727 -6.9683 -3.5067 -0.43866 1.9361 0.70564
quant0.25 36.833 -1.3234 -1.7208 1.7565 0.37923 2.7758 0.93483
quant0.5 38.452 0.95148 0.5539 4.0327 0.7483 3.3894 1.0926
quant0.75 40.076 3.2213 2.8238 6.3013 1.1174 4.1933 1.2903
quant0.975 43.81 8.4184 8.0227 11.485 1.9351 6.3789 1.7908
Also construct a plot the SD posteriors:
ddf <- data.frame(rbind(sigmaalpha,sigmaepsilon),errterm=gl(2,nrow(sigmaalpha),labels = c("alpha","epsilon")))
ggplot(ddf, aes(x,y, linetype=errterm))+geom_line()+xlab("yield")+ylab("density")+xlim(0,10)
Posteriors look OK.
Now try more informative gamma priors for the precisions. Define it so
the mean value of gamma prior is set to the inverse of the variance of
the residuals of the fixed-effects only model. We expect the two error
variances to be lower than this variance so this is an overestimate. The
variance of the gamma prior (for the precision) is controlled by the
apar
shape parameter.
apar <- 0.5
lmod <- lm(yield ~ irrigation+variety, data=irrigation)
bpar <- apar*var(residuals(lmod))
lgprior <- list(prec = list(prior="loggamma", param = c(apar,bpar)))
formula = yield ~ irrigation+variety+f(field, model="iid", hyper = lgprior)
result <- inla(formula, family="gaussian", data=irrigation)
summary(result)
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 38.483 3.236 32.016 38.473 45.018 38.477 0
irrigationi2 0.932 4.556 -8.264 0.944 10.050 0.940 0
irrigationi3 0.536 4.556 -8.657 0.547 9.657 0.543 0
irrigationi4 3.998 4.556 -5.219 4.016 13.096 4.010 0
varietyv2 0.750 0.582 -0.410 0.750 1.910 0.750 0
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for the Gaussian observations 0.925 0.443 0.313 0.841 2.013 0.738
Precision for field 0.071 0.049 0.015 0.059 0.197 0.063
is computed
Compute the summaries as before:
sigmaalpha <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[2]])
sigmaepsilon <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[1]])
restab=sapply(result$marginals.fixed, function(x) inla.zmarginal(x,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaalpha,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaepsilon,silent=TRUE))
colnames(restab) = c("mu","ir2","ir3","ir4","v2","alpha","epsilon")
data.frame(restab)
mu ir2 ir3 ir4 v2 alpha epsilon
mean 38.484 0.93277 0.5371 3.9992 0.74975 4.3712 1.129
sd 3.2352 4.5541 4.554 4.5544 0.58142 1.4895 0.27434
quant0.025 32.021 -8.2565 -8.6494 -5.2117 -0.40984 2.2602 0.70672
quant0.25 36.538 -1.7856 -2.1818 1.2849 0.38505 3.3082 0.93208
quant0.5 38.465 0.93267 0.53619 4.0054 0.74835 4.0901 1.0874
quant0.75 40.4 3.6426 3.2463 6.7132 1.1116 5.1335 1.2826
quant0.975 45.001 10.024 9.6314 13.071 1.9063 8.0433 1.7774
Make the plots:
ddf <- data.frame(rbind(sigmaalpha,sigmaepsilon),errterm=gl(2,nrow(sigmaalpha),labels = c("alpha","epsilon")))
ggplot(ddf, aes(x,y, linetype=errterm))+geom_line()+xlab("yield")+ylab("density")+xlim(0,10)
Posteriors look OK.
In Simpson et al (2015), penalized complexity priors are proposed. This requires that we specify a scaling for the SDs of the random effects. We use the SD of the residuals of the fixed effects only model (what might be called the base model in the paper) to provide this scaling.
lmod <- lm(yield ~ irrigation+variety, irrigation)
sdres <- sd(residuals(lmod))
pcprior <- list(prec = list(prior="pc.prec", param = c(3*sdres,0.01)))
formula <- yield ~ irrigation+variety+f(field, model="iid", hyper = pcprior)
result <- inla(formula, family="gaussian", data=irrigation)
summary(result)
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 38.473 2.932 32.621 38.467 44.365 38.468 0
irrigationi2 0.944 4.125 -7.344 0.951 9.185 0.950 0
irrigationi3 0.547 4.125 -7.739 0.554 8.790 0.553 0
irrigationi4 4.017 4.125 -4.283 4.028 12.247 4.025 0
varietyv2 0.750 0.587 -0.422 0.750 1.921 0.750 0
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for the Gaussian observations 0.925 0.448 0.315 0.839 2.033 0.732
Precision for field 0.081 0.052 0.020 0.069 0.217 0.071
is computed
Compute the summaries as before:
sigmaalpha <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[2]])
sigmaepsilon <- inla.tmarginal(function(x) 1/sqrt(exp(x)),result$internal.marginals.hyperpar[[1]])
restab=sapply(result$marginals.fixed, function(x) inla.zmarginal(x,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaalpha,silent=TRUE))
restab=cbind(restab, inla.zmarginal(sigmaepsilon,silent=TRUE))
colnames(restab) = c("mu","ir2","ir3","ir4","v2","alpha","epsilon")
data.frame(restab)
mu ir2 ir3 ir4 v2 alpha epsilon
mean 38.473 0.94403 0.54752 4.017 0.74974 4.015 1.1292
sd 2.9295 4.1209 4.1209 4.121 0.58691 1.2581 0.27401
quant0.025 32.622 -7.3409 -7.7359 -4.28 -0.42121 2.1544 0.7034
quant0.25 36.657 -1.5958 -1.9926 1.4794 0.38208 3.113 0.93254
quant0.5 38.459 0.94141 0.5444 4.0182 0.74833 3.8038 1.0893
quant0.75 40.267 3.4726 3.0758 6.5479 1.1146 4.6895 1.284
quant0.975 44.349 9.1628 8.7678 12.224 1.9176 7.0523 1.7733
Make the plots:
ddf <- data.frame(rbind(sigmaalpha,sigmaepsilon),errterm=gl(2,nrow(sigmaalpha),labels = c("alpha","epsilon")))
ggplot(ddf, aes(x,y, linetype=errterm))+geom_line()+xlab("yield")+ylab("density")+xlim(0,10)
Posteriors look OK. Not much difference between the three priors tried here.
STAN performs Bayesian inference using MCMC.
I use cmdstanr
to access Stan from R.
You see below the Stan code to fit our model. Rmarkdown allows the use of Stan chunks (elsewhere I have R chunks). The chunk header looks like this.
STAN chunk will be compiled to ‘mod’. Chunk header is:
cmdstan, output.var="mod", override = FALSE
data {
int<lower=0> N;
array[N] int<lower=1,upper=8> field;
array[N] int<lower=1,upper=4> irrigation;
array[N] int<lower=1,upper=2> variety;
array[N] real y;
}
transformed data { // need to manually create dummy variables
vector[N] irmeth2;
vector[N] irmeth3;
vector[N] irmeth4;
vector[N] var2;
for (i in 1:N) {
irmeth2[i] = irrigation[i] == 2;
irmeth3[i] = irrigation[i] == 3;
irmeth4[i] = irrigation[i] == 4;
var2[i] = variety[i] == 2;
}
}
parameters {
vector[8] eta;
real mu;
real ir2;
real ir3;
real ir4;
real va2;
real<lower=0> sigmaf;
real<lower=0> sigmay;
}
transformed parameters {
vector[8] fld;
vector[N] yhat;
fld = sigmaf * eta;
for (i in 1:N)
yhat[i] = mu+ir2*irmeth2[i]+ir3*irmeth3[i]+ir4*irmeth4[i]+va2*var2[i]+fld[field[i]];
}
model {
eta ~ normal(0, 1);
y ~ normal(yhat, sigmay);
}
We have used uninformative priors for the fixed effects and the two variances. Prepare data in a format consistent with the command file. Needs to be a list.
irridat <- with(irrigation,list(N=length(yield), y=yield, field=as.numeric(field), irrigation=as.numeric(irrigation), variety=as.numeric(variety)))
Do the MCMC sampling:
fit <- mod$sample(
data = irridat,
seed = 123,
chains = 4,
parallel_chains = 4,
refresh = 500 # print update every 500 iters
)
Running MCMC with 4 parallel chains...
Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 3 Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 3 Iteration: 500 / 2000 [ 25%] (Warmup)
Chain 4 Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
Chain 3 Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 3 Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 4 Iteration: 500 / 2000 [ 25%] (Warmup)
Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 4 Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 4 Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
Chain 3 Iteration: 1500 / 2000 [ 75%] (Sampling)
Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
Chain 3 Iteration: 2000 / 2000 [100%] (Sampling)
Chain 4 Iteration: 1500 / 2000 [ 75%] (Sampling)
Chain 1 finished in 0.6 seconds.
Chain 2 finished in 0.6 seconds.
Chain 3 finished in 0.5 seconds.
Chain 4 Iteration: 2000 / 2000 [100%] (Sampling)
Chain 4 finished in 0.7 seconds.
All 4 chains finished successfully.
Mean chain execution time: 0.6 seconds.
Total execution time: 0.8 seconds.
Extract the draws into a convenient dataframe format:
draws_df <- fit$draws(format = "df")
For the field SD:
ggplot(draws_df,
aes(x=.iteration,y=sigmaf,color=factor(.chain))) + geom_line() +
labs(color = 'Chain', x="Iteration")
which also looks reasonable.
Examine the output for the parameters we are mostly interested in:
fit$summary(c("mu","ir2","ir3","ir4","va2","sigmaf","sigmay","fld"))
# A tibble: 15 × 10
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
<chr> <num> <num> <num> <num> <num> <num> <num> <num> <num>
1 mu 38.2 38.3 4.94 3.71 30.2 45.6 1.00 1012. 914.
2 ir2 1.38 1.32 7.19 5.11 -9.50 13.0 1.00 1076. 1116.
3 ir3 0.973 0.878 6.91 5.25 -9.14 11.7 1.01 1107. 1011.
4 ir4 4.28 4.05 7.41 5.14 -6.21 15.5 1.00 1166. 1237.
5 va2 0.726 0.732 0.821 0.700 -0.600 1.99 1.00 3331. 1687.
6 sigmaf 6.12 5.13 3.59 2.39 2.56 13.2 1.01 570. 1108.
7 sigmay 1.51 1.38 0.562 0.406 0.904 2.53 1.00 783. 967.
8 fld[1] -1.81 -1.86 4.93 3.72 -9.24 6.15 1.00 1033. 901.
9 fld[2] -2.38 -2.42 4.90 3.70 -9.77 5.07 1.00 1937. 2032.
10 fld[3] -3.73 -3.63 4.84 3.57 -11.1 3.61 1.00 2197. 1434.
11 fld[4] -2.96 -2.89 5.36 3.77 -10.9 4.85 1.00 2178. 1602.
12 fld[5] 2.25 2.11 4.92 3.81 -4.95 10.2 1.01 990. 912.
13 fld[6] 2.02 1.96 4.94 3.82 -5.36 9.79 1.00 1960. 1901.
14 fld[7] 3.42 3.37 4.87 3.68 -3.74 11.0 1.00 2140. 1617.
15 fld[8] 3.01 2.96 5.41 3.79 -4.94 11.2 1.00 2224. 1632.
We see the posterior mean, median and SD, MAD of the samples. We see
some quantiles from which we could construct a 95% credible interval
(for example). The effective sample sizes for the primary parameters is
good enough for most purposes. The
Notice that the posterior mean for field SD is substantially larger than seen in the mixed effect model or the previous INLA models.
Plot the posteriors for the variance components
sdf = stack(draws_df[,startsWith(colnames(draws_df),"sigma")])
colnames(sdf) = c("yield","sigma")
levels(sdf$sigma) = c("field","epsilon")
ggplot(sdf, aes(x=yield,color=sigma)) + geom_density() +xlim(0,20)
We see that the error SD can be localized much more than the field SD. We can also look at the field effects:
sdf = stack(draws_df[,startsWith(colnames(draws_df),"fld")])
colnames(sdf) = c("yield","fld")
levels(sdf$fld) = 1:8
ggplot(sdf, aes(x=yield,color=fld)) + geom_density() + xlim(-25,25)
We are looking at the differences from the overall mean. We see that all eight field distributions clearly overlap zero. There is a distinction between the first four and the second four fields. We can also look at the “fixed” effects:
sdf = stack(draws_df[,c("ir2","ir3","ir4","va2")])
colnames(sdf) = c("yield","fixed")
levels(sdf$fixed) = c("ir2","ir3","ir4","va2")
ggplot(sdf, aes(x=yield,color=fixed)) + geom_density() + xlim(-15,15)
We are looking at the differences from the reference level. We see that all four distributions clearly overlap zero although we are able to locate the difference between the varieties more precisely than the difference between the fields.
BRMS stands for Bayesian Regression Models with STAN. It provides a convenient wrapper to STAN functionality.
Fitting the model is very similar to lmer
as seen above:
suppressMessages(bmod <- brm(yield ~ irrigation + variety + (1|field),
irrigation, iter=10000, cores=4, backend = "cmdstanr"))
We get some warnings but not as severe as seen with our STAN fit above. We can obtain some posterior densities and diagnostics with:
plot(bmod, variable = "^s", regex=TRUE)
We have chosen only the random effect hyperparameters since this is where problems will appear first. Looks OK.
We can look at the STAN code that brms
used with:
stancode(bmod)
// generated with brms 2.19.0
functions {
}
data {
int<lower=1> N; // total number of observations
vector[N] Y; // response variable
int<lower=1> K; // number of population-level effects
matrix[N, K] X; // population-level design matrix
// data for group-level effects of ID 1
int<lower=1> N_1; // number of grouping levels
int<lower=1> M_1; // number of coefficients per level
array[N] int<lower=1> J_1; // grouping indicator per observation
// group-level predictor values
vector[N] Z_1_1;
int prior_only; // should the likelihood be ignored?
}
transformed data {
int Kc = K - 1;
matrix[N, Kc] Xc; // centered version of X without an intercept
vector[Kc] means_X; // column means of X before centering
for (i in 2 : K) {
means_X[i - 1] = mean(X[ : , i]);
Xc[ : , i - 1] = X[ : , i] - means_X[i - 1];
}
}
parameters {
vector[Kc] b; // population-level effects
real Intercept; // temporary intercept for centered predictors
real<lower=0> sigma; // dispersion parameter
vector<lower=0>[M_1] sd_1; // group-level standard deviations
array[M_1] vector[N_1] z_1; // standardized group-level effects
}
transformed parameters {
vector[N_1] r_1_1; // actual group-level effects
real lprior = 0; // prior contributions to the log posterior
r_1_1 = sd_1[1] * z_1[1];
lprior += student_t_lpdf(Intercept | 3, 40.1, 3.9);
lprior += student_t_lpdf(sigma | 3, 0, 3.9)
- 1 * student_t_lccdf(0 | 3, 0, 3.9);
lprior += student_t_lpdf(sd_1 | 3, 0, 3.9)
- 1 * student_t_lccdf(0 | 3, 0, 3.9);
}
model {
// likelihood including constants
if (!prior_only) {
// initialize linear predictor term
vector[N] mu = rep_vector(0.0, N);
mu += Intercept;
for (n in 1 : N) {
// add more terms to the linear predictor
mu[n] += r_1_1[J_1[n]] * Z_1_1[n];
}
target += normal_id_glm_lpdf(Y | Xc, mu, b, sigma);
}
// priors including constants
target += lprior;
target += std_normal_lpdf(z_1[1]);
}
generated quantities {
// actual population-level intercept
real b_Intercept = Intercept - dot_product(means_X, b);
}
We see that brms
is using student t distributions with 3 degrees of
freedom for the priors. For the two error SDs, this will be truncated at
zero to form half-t distributions. You can get a more explicit
description of the priors with prior_summary(bmod)
. These are
qualitatively similar to the half-normal and the PC prior used in the
INLA fit.
We examine the fit:
summary(bmod)
Family: gaussian
Links: mu = identity; sigma = identity
Formula: yield ~ irrigation + variety + (1 | field)
Data: irrigation (Number of observations: 16)
Draws: 4 chains, each with iter = 10000; warmup = 5000; thin = 1;
total post-warmup draws = 20000
Group-Level Effects:
~field (Number of levels: 8)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 4.40 1.65 2.13 8.56 1.00 6645 8987
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 38.49 3.34 31.80 45.21 1.00 8915 8496
irrigationi2 0.81 4.77 -9.00 10.23 1.00 8816 8892
irrigationi3 0.48 4.84 -9.33 9.99 1.00 9288 9576
irrigationi4 3.98 4.82 -5.92 13.73 1.00 8675 8640
varietyv2 0.75 0.78 -0.84 2.31 1.00 16645 9466
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 1.47 0.51 0.83 2.76 1.00 5156 6305
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
The posterior mean for the field SD is more comparable to the mixed model and INLA values seen earlier and smaller than the STAN fit. This can be ascribed to the more informative prior used for the BRMS fit.
It is possible to fit some GLMMs within the GAM framework of the mgcv
package. An explanation of this can be found in this
blog
The field
term must be a factor for this to work:
gmod = gam(yield ~ irrigation + variety + s(field,bs="re"),
data=irrigation, method="REML")
and look at the summary output:
summary(gmod)
Family: gaussian
Link function: identity
Formula:
yield ~ irrigation + variety + s(field, bs = "re")
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 38.425 2.952 13.02 3e-06
irrigationi2 1.000 4.154 0.24 0.82
irrigationi3 0.600 4.154 0.14 0.89
irrigationi4 4.100 4.154 0.99 0.36
varietyv2 0.750 0.597 1.26 0.25
Approximate significance of smooth terms:
edf Ref.df F p-value
s(field) 3.83 4 23.2 0.00034
R-sq.(adj) = 0.888 Deviance explained = 94.6%
-REML = 27.398 Scale est. = 1.4257 n = 16
We get the fixed effect estimates. We also get a test on the random effect (as described in this article. The hypothesis of no variation between the fields is rejected.
We can get an estimate of the operator and error SD:
gam.vcomp(gmod)
Standard deviations and 0.95 confidence intervals:
std.dev lower upper
s(field) 4.067 1.97338 8.3820
scale 1.194 0.70717 2.0161
Rank: 2/2
which is the same as the REML estimate from lmer
earlier.
The random effect estimates for the fields can be found with:
coef(gmod)
(Intercept) irrigationi2 irrigationi3 irrigationi4 varietyv2 s(field).1 s(field).2 s(field).3 s(field).4
38.4250 1.0000 0.6000 4.1000 0.7500 -2.0612 -2.2529 -3.6430 -3.0199
s(field).5 s(field).6 s(field).7 s(field).8
2.0612 2.2529 3.6430 3.0199
In Wood (2019), a simplified
version of INLA is proposed. The first construct the GAM model without
fitting and then use the ginla()
function to perform the computation.
gmod = gam(yield ~ irrigation + variety + s(field,bs="re"),
data=irrigation, fit = FALSE)
gimod = ginla(gmod)
We get the posterior density for the intercept as:
plot(gimod$beta[1,],gimod$density[1,],type="l",xlab="yield",ylab="density")
and for the treatment effects as:
xmat = t(gimod$beta[2:5,])
ymat = t(gimod$density[2:5,])
matplot(xmat, ymat,type="l",xlab="yield",ylab="density")
legend("right",c("i2","i3","i4","v2"),col=1:4,lty=1:4)
xmat = t(gimod$beta[6:13,])
ymat = t(gimod$density[6:13,])
matplot(xmat, ymat,type="l",xlab="yield",ylab="density")
legend("right",paste0("field",1:8),col=1:8,lty=1:8)
It is not straightforward to obtain the posterior densities of the hyperparameters.
See the Discussion of the single random effect
model for general comments. Given that the fixed
effects are not significant here, this example is not so different from
the single random effect example. This provides an illustration of why
we need to pay attention to the priors. In the pulp
example, the
default priors resulted in unbelievable results from INLA and we were
prompted to consider alternatives. In this example, the default INLA
priors produce output that looked passable and, if we were feeling lazy,
we might have skipped a look at the alternatives. In this case, they do
not make much difference. Contrast this with the default STAN priors
used - the output looks reasonable but the answers are somewhat
different. Had we not been trying these other analyses, we would not be
aware of this. The minimal analyst might have stopped there. But BRMS
uses more informative priors and produces results closer to the other
methods.
STAN/BRMS put more weight on low values of the random effects SDs
whereas the INLA posteriors are clearly bounded away from zero. We saw a
similar effect in the pulp
example. Although we have not matched up
the priors exactly, there does appear to be some structural difference.
Much of the worry above centers on the random effect SDs. The fixed
effects seem quite robust to these concerns. If we only care about
these, GINLA is giving us what we need with the minimum amount of effort
(we would not even need to install any packages beyond the default
distribution of R, though this is an historic advantage for mgcv
).
sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.4.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/London
tzcode source: internal
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] mgcv_1.8-42 nlme_3.1-162 brms_2.19.0 Rcpp_1.0.10 cmdstanr_0.5.3 knitr_1.43 INLA_23.05.30-1
[8] sp_2.0-0 foreach_1.5.2 RLRsim_3.1-8 pbkrtest_0.5.2 lme4_1.1-33 Matrix_1.5-4.1 ggplot2_3.4.2
[15] faraway_1.0.8
loaded via a namespace (and not attached):
[1] gridExtra_2.3 inline_0.3.19 rlang_1.1.1 magrittr_2.0.3 matrixStats_1.0.0
[6] compiler_4.3.1 loo_2.6.0 systemfonts_1.0.4 callr_3.7.3 vctrs_0.6.3
[11] reshape2_1.4.4 stringr_1.5.0 crayon_1.5.2 pkgconfig_2.0.3 fastmap_1.1.1
[16] backports_1.4.1 ellipsis_0.3.2 labeling_0.4.2 utf8_1.2.3 threejs_0.3.3
[21] promises_1.2.0.1 rmarkdown_2.22 markdown_1.7 ps_1.7.5 nloptr_2.0.3
[26] MatrixModels_0.5-1 purrr_1.0.1 xfun_0.39 jsonlite_1.8.5 later_1.3.1
[31] Deriv_4.1.3 prettyunits_1.1.1 broom_1.0.5 R6_2.5.1 dygraphs_1.1.1.6
[36] StanHeaders_2.26.27 stringi_1.7.12 boot_1.3-28.1 rstan_2.21.8 iterators_1.0.14
[41] zoo_1.8-12 base64enc_0.1-3 bayesplot_1.10.0 httpuv_1.6.11 splines_4.3.1
[46] igraph_1.5.0 tidyselect_1.2.0 rstudioapi_0.14 abind_1.4-5 yaml_2.3.7
[51] codetools_0.2-19 miniUI_0.1.1.1 processx_3.8.1 pkgbuild_1.4.1 lattice_0.21-8
[56] tibble_3.2.1 plyr_1.8.8 shiny_1.7.4 withr_2.5.0 bridgesampling_1.1-2
[61] posterior_1.4.1 coda_0.19-4 evaluate_0.21 RcppParallel_5.1.7 xts_0.13.1
[66] pillar_1.9.0 tensorA_0.36.2 stats4_4.3.1 checkmate_2.2.0 DT_0.28
[71] shinyjs_2.1.0 distributional_0.3.2 generics_0.1.3 rstantools_2.3.1 munsell_0.5.0
[76] scales_1.2.1 minqa_1.2.5 gtools_3.9.4 xtable_1.8-4 glue_1.6.2
[81] tools_4.3.1 shinystan_2.6.0 data.table_1.14.8 colourpicker_1.2.0 mvtnorm_1.2-2
[86] grid_4.3.1 tidyr_1.3.0 crosstalk_1.2.0 colorspace_2.1-0 cli_3.6.1
[91] fansi_1.0.4 svglite_2.1.1 Brobdingnag_1.2-9 dplyr_1.1.2 gtable_0.3.3
[96] digest_0.6.31 htmlwidgets_1.6.2 farver_2.1.1 htmltools_0.5.5 lifecycle_1.0.3
[101] mime_0.12 shinythemes_1.2.0 MASS_7.3-60