There are many ways to fit random effects in `mgcv`. Which is “best” depends a little on the exact nature of the data being modelled and the structure of the random effect one needs. This document outlines (probably) all of the ways you can fit random effects in `mgcv` (and some outside of `mgcv`).
A quick bit of theory and some consequences for practice
Before getting into the details, it’s worth noting that random effects and splines are equivalent. We can view a spline as a random effect: the coefficients estimated for the spline are random effects, and their associated penalty is effectively an inverse covariance matrix (the penalty gives the structure of the covariance, and the smoothing parameter scales it). We can therefore go in the other direction too, and obtain the penalty that corresponds to a given covariance matrix structure from a random effect, with the variance parameter being the inverse of the smoothing parameter.
This is all very convenient mathematically, but it can leave us in a bit of trouble computationally. In a lot of problems where we have a random effect, its structure is sparse; that is, the covariance matrix tends to have a lot of zeros in it. We can see this from the structure of a simple random effect where we have one coefficient for each “block” in the data (e.g., individual, replicate or some such): in that case the covariance matrix is an identity matrix. Random effects software takes advantage of this structure (knowing that many entries in matrix multiplications can be ignored because they are zero, saving time). Splines, on the other hand, are dense: their penalty matrices contain almost no zeros, so computation must crank through all of the parts of a matrix multiplication. This leaves us in a bit of a conflict, as we want to take advantage of the speed that sparsity gives us while not performing any of the spline calculations incorrectly. There are additional repercussions: this issue can also make the numerics of a problem unstable, giving potential convergence issues when fitting models that mix splines and random effects. This has historically been a bit of an issue – it doesn’t always end badly, but it can be immensely frustrating when it does.
“Simple” random effects via `s(x, bs="re")`
The easiest way to get simple independent random effects into an `mgcv` model is to use the `"re"` basis. If we write `s(x, bs="re")` that will give us a random effect for `x`. If `x` is a factor variable this gives a random intercept model, whereas if `x` is a numeric variable, we have a random slope model. In both cases we assume a normal distribution for the coefficients, with an unknown scale (variance) parameter which is then estimated.
Since these random effects are fitted using the same construction as splines (i.e., the random effects are “converted” to spline format for fitting), the results in the `summary` of a fitted model may not be that useful at first glance. However, we can convert the terms in our fitted GAM using the `gam.vcomp` function to obtain variance estimates as we would from other random effects software.
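A minimal sketch of this workflow, using simulated data (the variable names here are invented for illustration):

```r
library(mgcv)

# simulate a random intercept per level of a factor g
set.seed(1)
g <- factor(rep(1:10, each = 20))   # grouping factor, 10 groups
b <- rnorm(10, sd = 2)              # true random intercepts
y <- 3 + b[g] + rnorm(200)          # response

# bs="re" gives a random intercept for each level of g
m <- gam(y ~ s(g, bs = "re"), method = "REML")

# convert the smoothing parameter to standard deviation/variance scale
gam.vcomp(m)
```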
The random effects basis functions constructed using `bs="re"` allow for more than one covariate to be included. For example, `s(x, v, w, bs="re")` would result in a random effect for the `x:v:w` interaction. These can be a mix of factor and numeric variables.
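For instance, a factor–numeric pair gives a random slope per group. A sketch with invented data:

```r
library(mgcv)

# simulate per-group slopes in x for a factor g
set.seed(2)
g <- factor(rep(1:8, each = 25))
x <- runif(200)
slope <- rnorm(8, sd = 1.5)              # true per-group slopes
y <- 1 + slope[g] * x + rnorm(200, sd = 0.3)

# s(x, g, bs="re") gives a random slope in x for each level of g
m <- gam(y ~ x + s(x, g, bs = "re"), method = "REML")
gam.vcomp(m)
```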
Anecdotally, it seems that using marginal likelihood (`method="ML"`) or restricted maximum likelihood (`method="REML"`) leads to better convergence than the other methods in `mgcv`.
For larger datasets, these terms can also be used in `bam` models.
See also `?smooth.construct.re.smooth.spec`.
Using `lme`-style random effects via `gamm`
The other way of specifying and fitting random effects is via the `gamm` function (as opposed to `gam`). Rather than moving our random effects into a spline formulation, `gamm` takes our splines and recasts them as random effects. Once the terms are constructed by `mgcv`, `gamm` passes the random effects model to `nlme` for fitting. Specifying random effects in the model is now done via the `random=` argument to `gamm`, in the “classic” `lme` list style. For a random intercept per level of a grouping factor `g` in the data, we can write `random=list(g=~1)`. More complicatedly, `random=list(x1=~1, g=~x2)` will (assuming `x1` and `g` are factors and `x2` is numeric) give a model with random intercepts for `x1` and random slopes (and intercepts) in `x2` within `g`.
One of the confusions when fitting with `gamm` is that the returned object is not a standard `mgcv` model. It is a `list` containing two elements: one is the `lme` object returned by `nlme` (giving the model as if all components were random effects) and the other is a normal `gam` return object, which has the smoothers converted back to spline form for easier interpretation. These elements are named `$lme` and `$gam`, respectively.
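Putting this together, a sketch with invented data, showing the two views of the fitted model:

```r
library(mgcv)

# simulate a smooth effect of x plus a random intercept per level of g
set.seed(3)
g <- factor(rep(1:10, each = 20))
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(10)[g] + rnorm(200, sd = 0.3)

# lme-style list form for the random effect
m <- gamm(y ~ s(x), random = list(g = ~1))

summary(m$gam)   # spline-form view of the smooth terms
summary(m$lme)   # nlme view, including the variance components
```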
In addition to `lme`-style random effects, `gamm` also allows the use of a correlation structure via the `correlation=` argument. This is specified as in `lme`.
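For example, residual AR(1) correlation within each level of a grouping factor might be specified as follows (a sketch; `time` is an invented within-group index):

```r
library(mgcv)
library(nlme)   # provides corAR1 and friends

# simulate grouped data observed over an index "time"
set.seed(4)
g <- factor(rep(1:10, each = 20))
time <- rep(1:20, 10)
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200)

# correlation= is passed through to lme
m <- gamm(y ~ s(x), correlation = corAR1(form = ~ time | g))
```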
Note that `gamm` doesn’t perform well when the response variable is binary. Overall it is less numerically stable than using `gam`, so it can often crash or fail to converge. This is in part down to the use of penalized quasi-likelihood (PQL) to fit the models.
See also `?gamm`.
Using `lme4`-style random effects with `gamm4`
Rather than using `nlme` as a back-end to fit random effects, we can use `lme4`. This is done via the `gamm4` package, rather than `mgcv`. Instead of using PQL for fitting, `gamm4` uses REML or ML. This has the advantage of being better for binary data, and tends to be more stable. Unfortunately, it doesn’t support correlation structures.
Writing models is as in `gamm`, via the `random=` argument, but now in `lme4` style: for example, `random=~(1|g)` gives a random intercept per level of `g`. The resulting object is a `list` with `$gam` and `$mer` elements giving the GAM and GLMM interpretations of the model, respectively.
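A sketch with an invented binary response, where `gamm4`’s use of (RE)ML rather than PQL pays off:

```r
library(gamm4)   # pulls in lme4

# simulate binary data with a smooth effect of x and a random
# intercept per level of g
set.seed(5)
g <- factor(rep(1:10, each = 20))
x <- runif(200)
eta <- sin(2 * pi * x) + rnorm(10)[g]
y <- rbinom(200, 1, plogis(eta))

# lme4-style random effect specification
m <- gamm4(y ~ s(x), random = ~ (1 | g), family = binomial)

summary(m$gam)   # GAM (spline-form) interpretation
summary(m$mer)   # lme4 (GLMM) interpretation
```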
See also `?gamm4`.
Completely custom random effects with `paraPen`
Finally, we can specify any random effects structure we like within a `gam`/`bam` model using the `paraPen` approach. This allows us to include a matrix in the `formula` for the model, then attach a penalty (or penalties) to that matrix covariate (we attach a `Pen`alty to `para`metric terms in the model, hence `paraPen`). For example, we could construct the random effects design matrix, `X` say, using `model.matrix`, then supply a diagonal matrix as the penalty/covariance for that term to get a simple random effect. That would be specified something like `gam(y~X, paraPen=list(X=list(diag(Xn))))` (if `Xn` is the dimension of the random effect; the names of the `paraPen` list must match the terms they penalize). This is a very flexible approach, but can be a bit fraught if you’re not 100% sure what you’re doing.
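A sketch of building a simple random intercept by hand this way, with invented data:

```r
library(mgcv)

# simulate a random intercept per level of g
set.seed(6)
g <- factor(rep(1:10, each = 20))
y <- 2 + rnorm(10, sd = 1.5)[g] + rnorm(200)

# random effects design matrix: one indicator column per level of g
X <- model.matrix(~ g - 1)
Xn <- ncol(X)

# identity penalty = i.i.d. normal coefficients, i.e. a random intercept;
# the paraPen list element is named after the term it penalizes
m <- gam(y ~ X, paraPen = list(X = list(diag(Xn))), method = "REML")
gam.vcomp(m)
```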
See also `?gam.models`.
Further reading
- The `mgcv` manual pages `?random.effects` and `?gam.models` have lots of useful bits in them.
- Gavin Simpson’s blog post Using random effects in GAMs with `mgcv`
- My preprint on how we interpret GAMs in a Bayesian way