There are many ways to fit random effects in `mgcv`. Which is “best” depends a little on the exact nature of the data being modelled and the structure of the random effect one needs. This document outlines (probably) all of the ways you can fit random effects in `mgcv` (and some outside of `mgcv`).
A quick bit of theory and some consequences for practice
Before getting into the details, it’s worth noting that random effects and splines are equivalent. We can view a spline as a random effect: the coefficients estimated for the spline are random effects, and their associated penalty is effectively an inverse covariance matrix (the penalty gives the structure of the covariance, and the smoothing parameter scales it). We can therefore go in the other direction too, and obtain the penalty that corresponds to a given covariance matrix structure from a random effect, with the variance parameter being the inverse of the smoothing parameter.
This is all very convenient mathematically, but it can leave us in a bit of trouble computationally. In a lot of problems where we have a random effect, its structure is sparse; that is, the covariance matrix tends to have a lot of zeros in it. We can see this from the structure of a simple random effect where we have one coefficient for each “block” in the data (e.g., individual, replicate or some such): in that case the covariance matrix is an identity matrix. Random effects software takes advantage of this structure (knowing that many entries in matrix multiplications can be ignored because they are zero, saving time). Splines, on the other hand, are dense: their penalty matrices contain almost no zeros, so computation must crank through all of the parts of a matrix multiplication. This leaves us in a bit of a conflict, as we want to take advantage of the speed that sparsity gives us while not performing any of the spline calculations incorrectly. There are additional repercussions: this issue can also make the numerics of a problem unstable, giving potential convergence issues when fitting models that mix splines and random effects. This has historically been a bit of an issue – it doesn’t always end badly, but it can be immensely frustrating when it does.
“Simple” random effects via `s(x, bs="re")`
The easiest way to get simple independent random effects into an `mgcv` model is to use the `"re"` basis. If we write `s(x, bs="re")` that will give us a random effect for `x`. If `x` is a factor variable this gives a random intercept model, whereas if `x` is a numeric variable, we have a random slope model. In both cases we assume a normal distribution for the coefficients, with an unknown scale (variance) parameter which is then estimated.
Since these random effects are fitted using the same construction as splines (i.e., the random effects are “converted” to spline format for fitting), the results in the `summary` of a fitted model may not be that useful at first glance. However, we can convert the terms in our fitted GAM using the `gam.vcomp` function to obtain variance estimates as we would from other random effects software.
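A minimal sketch of this workflow, using simulated data (the variable names here are invented for illustration):

```r
library(mgcv)

# simulate a random intercept per level of a factor g
set.seed(1)
g <- factor(rep(1:10, each = 20))   # grouping factor, 10 groups
b <- rnorm(10, sd = 2)              # true random intercepts
y <- 3 + b[g] + rnorm(200)          # response

# bs="re" gives a random intercept for each level of g
m <- gam(y ~ s(g, bs = "re"), method = "REML")

# convert the smoothing parameter to standard deviation/variance scale
gam.vcomp(m)
```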
The random effects basis functions constructed using `bs="re"` allow for more than one covariate to be included. For example, `s(x, v, w, bs="re")` would result in a random effect for the `x:v:w` interaction. These can be a mix of factor and numeric variables.
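For instance, a factor–numeric pair gives a random slope per group. A sketch with invented data:

```r
library(mgcv)

# simulate per-group slopes in x for a factor g
set.seed(2)
g <- factor(rep(1:8, each = 25))
x <- runif(200)
slope <- rnorm(8, sd = 1.5)              # true per-group slopes
y <- 1 + slope[g] * x + rnorm(200, sd = 0.3)

# s(x, g, bs="re") gives a random slope in x for each level of g
m <- gam(y ~ x + s(x, g, bs = "re"), method = "REML")
gam.vcomp(m)
```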
Anecdotally, it seems that using marginal likelihood (`method="ML"`) or restricted maximum likelihood (`method="REML"`) leads to better convergence than the other methods in `mgcv`.
For larger datasets, these terms can also be used in `bam` models.
See also `?smooth.construct.re.smooth.spec`.
Using `lme`-style random effects via `gamm`
The other way of specifying and fitting random effects is via the `gamm` function (as opposed to `gam`). Rather than moving our random effects into a spline formulation, `gamm` takes our splines and recasts them as random effects. Once the terms are constructed by `mgcv`, `gamm` passes the random effects model to `nlme` for fitting. Specifying random effects in the model is now done via the `random=` argument to `gamm`, in the “classic” `lme` list style. For a random intercept per level of a grouping factor `g` in the data, we can write `random=list(g=~1)`. More complicatedly, `random=list(x1=~1, g=~x2)` will (assuming `x1` and `g` are factors and `x2` is numeric) give a model with random intercepts for `x1` and random slopes (and intercepts) in `x2` within `g`.
One of the confusions when fitting with `gamm` is that the returned object is not a standard `mgcv` model. It is a `list` containing two elements: one is the `lme` object returned by `nlme` (giving the model as if all components were random effects) and the other is a normal `gam` return object, which has the smoothers converted back to spline form for easier interpretation. These elements are named `$lme` and `$gam`, respectively.
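Putting this together, a sketch with invented data, showing the two views of the fitted model:

```r
library(mgcv)

# simulate a smooth effect of x plus a random intercept per level of g
set.seed(3)
g <- factor(rep(1:10, each = 20))
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(10)[g] + rnorm(200, sd = 0.3)

# lme-style list form for the random effect
m <- gamm(y ~ s(x), random = list(g = ~1))

summary(m$gam)   # spline-form view of the smooth terms
summary(m$lme)   # nlme view, including the variance components
```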
In addition to `lme`-style random effects, `gamm` also allows the use of a correlation structure via the `correlation=` argument. This is specified as in `lme`.
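For example, residual AR(1) correlation within each level of a grouping factor might be specified as follows (a sketch; `time` is an invented within-group index):

```r
library(mgcv)
library(nlme)   # provides corAR1 and friends

# simulate grouped data observed over an index "time"
set.seed(4)
g <- factor(rep(1:10, each = 20))
time <- rep(1:20, 10)
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200)

# correlation= is passed through to lme
m <- gamm(y ~ s(x), correlation = corAR1(form = ~ time | g))
```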
Note that `gamm` doesn’t perform well when the response variable is binary. Overall it is less numerically stable than using `gam`, so it can often crash or fail to converge. This is in part down to the use of penalized quasi-likelihood (PQL) to fit the models.
See also `?gamm`.
Using `lme4`-style random effects with `gamm4`
Rather than using `nlme` as a back-end to fit random effects, we can use `lme4`. This is done via the `gamm4` package, rather than `mgcv`. Instead of using PQL for fitting, `gamm4` uses REML or ML. This has the advantage of being better for binary data, and tends to be more stable. Unfortunately, it doesn’t support correlation structures.
Writing models is as in `gamm`, via the `random=` argument, but now in `lme4` style: for example, `random=~(1|g)` gives a random intercept per level of `g`. The resulting object is a `list` with `$gam` and `$mer` elements giving the GAM and GLMM interpretations of the model, respectively.
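A sketch with an invented binary response, where `gamm4`’s use of (RE)ML rather than PQL pays off:

```r
library(gamm4)   # pulls in lme4

# simulate binary data with a smooth effect of x and a random
# intercept per level of g
set.seed(5)
g <- factor(rep(1:10, each = 20))
x <- runif(200)
eta <- sin(2 * pi * x) + rnorm(10)[g]
y <- rbinom(200, 1, plogis(eta))

# lme4-style random effect specification
m <- gamm4(y ~ s(x), random = ~ (1 | g), family = binomial)

summary(m$gam)   # GAM (spline-form) interpretation
summary(m$mer)   # lme4 (GLMM) interpretation
```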
See also `?gamm4`.
Completely custom random effects with `paraPen`
Finally, we can specify any random effects structure we like within a `gam`/`bam` model using the `paraPen` approach. This allows us to include a matrix in the `formula` for the model, then attach a penalty (or penalties) to that matrix covariate (we attach a `Pen`alty to `para`metric terms in the model, hence `paraPen`). For example, we could construct the random effects design matrix, `X` say, using `model.matrix`, then supply a diagonal matrix as the penalty/covariance for that term to get a simple random effect. That would be specified something like `gam(y~X, paraPen=list(X=list(diag(Xn))))` (if `Xn` is the dimension of the random effect; the names of the `paraPen` list must match the terms they penalize). This is a very flexible approach, but can be a bit fraught if you’re not 100% sure what you’re doing.
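A sketch of building a simple random intercept by hand this way, with invented data:

```r
library(mgcv)

# simulate a random intercept per level of g
set.seed(6)
g <- factor(rep(1:10, each = 20))
y <- 2 + rnorm(10, sd = 1.5)[g] + rnorm(200)

# random effects design matrix: one indicator column per level of g
X <- model.matrix(~ g - 1)
Xn <- ncol(X)

# identity penalty = i.i.d. normal coefficients, i.e. a random intercept;
# the paraPen list element is named after the term it penalizes
m <- gam(y ~ X, paraPen = list(X = list(diag(Xn))), method = "REML")
gam.vcomp(m)
```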
See also `?gam.models`.
Further reading
- The `mgcv` manual pages `?random.effects` and `?gam.models` have lots of useful bits in them.
- Gavin Simpson’s blog post Using random effects in GAMs with `mgcv`
- My preprint on how we interpret GAMs in a Bayesian way