Contrasts in Multilevel/Mixed Models

Regression (and multilevel/mixed models) gives an estimate of the effect of an independent variable on the dependent variable.

When estimating the effect of a categorical independent variable, the mixed model will estimate the effect of each of its levels, and tell us if each effect is significantly different from zero. This is useful information, however, most researcher will be interested in knowing if the levels of the independent variable differ from each other.

Taking the sleepstudy example again, and consider Days as a categorical rather than a continous variable, and ask the question does Reaction on day 1 differ from Reaction on day 2/3/4 etc?

Implementing the formula

The first step is to change the format of Days from continuous to factor. Second, when running the model, we will remove the fixed intercept, by subtracting 1 from the formula. This subtraction does not change the result but will make the successive operations a little easier as it provides the complete display of all the factor levels.

# install.packages("lmerTest")
# install.packages("knitr")
# install.packages("multcomp")
# install.packages("dplyr")
# install.packages("ggplot2")
data("sleepstudy")
# Load packages
packs <- c("lmerTest", "knitr", "multcomp", "dplyr", "ggplot2")
lapply(packs, require, character.only = TRUE)

This is the data as we already have visualised it before:

kable(head(sleepstudy))
ReactionDaysSubject
249.56000308
258.70471308
250.80062308
321.43983308
356.85194308
414.69015308

But we will change the format of Days to categorical, transforming it into a type of variable that is called factor in R. The levels of a factor are its possible values, in this case, Day 0, 1, 2, 3, 4 etc.

# Change the data type of Days from number (continous) to factor (categorical)
sleepstudy$Days <- factor(as.character(sleepstudy$Days), 
                          levels = unique(sleepstudy$Days))
levels(sleepstudy$Days)

[1] “0” “1” “2” “3” “4” “5” “6” “7” “8” “9”

And re-fit the model:

# Fit the model
m <- lmer(Reaction ~ Days - 1 + ( 1 | Subject), data=sleepstudy)
kable(coef(summary(m)))
EstimateStd. Errordft valueP-value
Days0256.651811.4577841.9829722.39978<2e-16 ***
Days1264.495811.4577841.9829723.08438<2e-16 ***
Days2265.361911.4577841.9829723.15997<2e-16 ***
Days3282.992011.4577841.9829724.69867<2e-16 ***
Days4288.649411.4577841.9829725.19243<2e-16 ***
Days5308.518511.4577841.9829726.92654<2e-16 ***
Days6312.178311.4577841.9829727.24596<2e-16 ***
Days7318.750611.4577841.9829727.81957<2e-16 ***
Days8336.629511.4577841.9829729.37999<2e-16 ***
Days9350.851211.4577841.9829730.62122<2e-16 ***

The output shows that each Day separately has a very significant effect on Reaction.

But do the Days actually differ from each other? The easiest way of comparing factors between each other is to choose a reference level. In this example, we will assume the first day of the experiment, Day 1, as a reference, and compare all the other days to Day 1.

Building the Contrasts Matrix

First, we need to compute all the combinations of the levels of the factor Day in a Contrasts Matrix:

group <- paste0(sleepstudy$Days)
group <- aggregate(model.matrix(m) ~ group, FUN=mean)
# group
rownames(group) <- group$group
(group <- group[,-1])
kable(group)
Days0Days1Days2Days3Days4Days5Days6Days7Days8Days9
01000000000
10100000000
20010000000
30001000000
40000100000
50000010000
60000001000
70000000100
80000000010
90000000001

The 1s specify when a level is active - given that we only have one factor, when one level is active (1), all the others are inactive (0). The matrix becomes more complicated if you have more than one factor. We can now specify the custom contrasts we want to compute between the levels, in this case, all levels against Day 1 of the experiment:

contrasts <- rbind(group["1",] - group["0",],
                   group["1",] - group["2",],
                   group["1",] - group["3",],
                   group["1",] - group["4",],
                   group["1",] - group["5",],
                   group["1",] - group["6",],
                   group["1",] - group["7",],
                   group["1",] - group["8",],
                   group["1",] - group["9",])
kable(contrasts)
Days0Days1Days2Days3Days4Days5Days6Days7Days8Days9
1-1100000000
1101-10000000
12010-1000000
130100-100000
1401000-10000
15010000-1000
160100000-100
1701000000-10
18010000000-1

Now the matrix specifies couples of contrasts that we want to estimate from the model.

Extracting the contrasts from the model

For extracting the contrasts from the model, we are going to use the glht function from the multcomp package.

library(multcomp)

# Format the matrix for better readibility
contrast.matrix <- rbind("Day 1 versus Day 0"=as.numeric(contrasts[1,]),
                         "Day 1 versus Day 2"=as.numeric(contrasts[2,]),
                         "Day 1 versus Day 3"=as.numeric(contrasts[3,]),
                         "Day 1 versus Day 4"=as.numeric(contrasts[4,]),
                         "Day 1 versus Day 5"=as.numeric(contrasts[5,]),
                         "Day 1 versus Day 6"=as.numeric(contrasts[6,]),
                         "Day 1 versus Day 7"=as.numeric(contrasts[7,]),
                         "Day 1 versus Day 8"=as.numeric(contrasts[8,]),
                         "Day 1 versus Day 9"=as.numeric(contrasts[9,]))

comparisons <- summary(glht(m, contrast.matrix))

comparisons

The output shows that Reaction is not significantly different from baseline (Day 0) on Day 1. Also, Reaction only starts to differ from baseline since Day 5, and the difference grows progressively until Day 9.

Let’s format the results of the contrasts test into a table:

pq <- comparisons$test

mtests <- data.frame(pq$coefficients, pq$sigma, pq$tstat, pq$pvalues)
error <- attr(pq$pvalues, "error")
pname <- switch(comparisons$alternativ, 
                less = paste("Pr(<", 
                             ifelse(comparisons$df ==0, "z", "t"), ")", 
                             sep = ""), 
                 greater = paste("Pr(>", 
                                 ifelse(comparisons$df == 0, "z", "t"), ")", 
                                 sep = ""), 
                 two.sided = paste("Pr(>|", 
                                   ifelse(comparisons$df == 0, "z", "t"), "|)"
                                   , sep = ""))                                   
colnames(mtests) <- c("Estimate", "Std.Error", 
                      ifelse(comparisons$df ==0, "zvalue", "tvalue"), pname)

mtests <- mtests %>%
  tibble::rownames_to_column("Comparison") %>%
  mutate(`Pr(>|z|)`=ifelse(`Pr(>|z|)`< 0.001, 
                           "< 0.001", 
                           ifelse(`Pr(>|z|)` < 0.01, 
                                  "< 0.01",
                                  ifelse(`Pr(>|z|)` < 0.05, 
                                         "< 0.05",
                                         paste(round(`Pr(>|z|)`,
                                                     4)))))) %>%
  mutate_if(is.numeric, funs(round(.,
                                   digits=2)))
kable(mtests)
EstimateStd.ErrorzvaluePr(>|z|)
Day 1 versus Day 07.843950010.475310.74880380.9827613
Day 1 versus Day 2-0.866144410.47531-0.08268441.0000000
Day 1 versus Day 3-18.496255610.47531-1.76570060.3815733
Day 1 versus Day 4-24.153666710.47531-2.30577180.1301341
Day 1 versus Day 5-44.022700010.47531-4.20252130.0002050
Day 1 versus Day 6-47.682500010.47531-4.55189530.0000436
Day 1 versus Day 7-54.254827810.47531-5.17930680.0000019
Day 1 versus Day 8-72.133750010.47531-6.88607510.0000000
Day 1 versus Day 9-86.355466710.47531-8.24371710.0000000

Next Steps

Contrasts can be customised depending on the research question at hand, and much more complicated contrasts matrix are often needed.

Another scenario when this procedure may turn out useful is when an interaction is present. In that case, the contrasts matrix will indicate the combination between two different factors. We will explore an example with an interaction in the next post.

Teresa Del Bianco
Teresa Del Bianco
Postdoctoral Researcher

Scientist researching brain and child development and neurodiversity.

Related