ONLINE SUPPLEMENT: Understanding different types of repeatability and intra-class correlation for an analysis of biological variation

Author

Shinichi Nakagawa, David F. Westneat, Ayumi Mizuno, Yimen G. Araya-Ajoy, Barbara Class, Niels Dingemanse, Ned A. Dochtermann, Malgorzata Lagisz, Kate L. Laskowski, Joel L. Pick, Denis Réale, Coralie Williams, Jonathan Wright and Holger Schielzeth

1 Introduction

This online material is a supplement to our paper “Understanding different types of repeatabilty and intra-class correlation for an analysis of biological variation”. We will show how to obtain different types of repeatability (intra-class correlation) using the lme4, rptR and partR2 packages in R.

2 Contents

In this online material, we will illustrate how to get repeatabilty for four traits: 1) body length of adult beetles (Gaussian distribution, N = 2,160), 2) frequencies of the two distinct male colour morphs (binomial or Bernoulli distribution, N = 1,080), 3) the number of eggs laid by each female (Poisson distribution) after random mating (N = 1,080) and 4) the activity levels (log-normally distributed) with three observations per individual (N = 6,480).

**Fig: An overview of experimental and sampling design**

After generating these data, we will:

Use lme4 to fit a full model, which match the data generating process, and a reduced model, which only fit the target clustering variable (e.g., Population).
Use rptR and partR2 to obtain variance component for each fixed and random effect including residuals.
By using estimated variance components, estimate different types of repeatability (e.g., adjusted, unadjusted and marginal repeatability)

In addition, we show how to obtain conditional repeatability for body length at different levels of moisture (0, 1 and 2).

Also, importantly, both rptR and partR2 packages have vignettes that provide detailed information on how to use them and associated publications.

The rptR package is available on CRAN and the vignette can be found here.

Stoffel, M. A., Nakagawa, S., & Schielzeth, H. (2017). rptR: Repeatability estimation and variance decomposition by generalized linear mixed‐effects models. Methods in ecology and evolution, 8(11), 1639-1644.

The partR2 package is available on CRAN and the vignette can be found here.

Stoffel, M. A., Nakagawa, S., & Schielzeth, H. (2021). partR2: Partitioning R2 in generalized linear mixed models. PeerJ, 9, e11414.

3 Prerequisites

3.1 Loading packages

Our tutorial uses R statistical software and existing R packages, which you will first need to download and install.

If the packages are archived in CRAN, use install.packages() to install them. For example, to install the lme4 , you can execute install.packages("lme4") in the console (bottom left pane of R Studio).

Version information of each package is listed at the end of this tutorial.

Code

# clearing memory
rm(list = ls())

# attempt to load or install necessary packages
if(!require(pacman)) install.packages("pacman")
pacman::p_load(lme4,
               lmerTest,
               rptR,
               partR2,
               pander,
               here,
               sessioninfo,
               details
)

4 Basic data and preparation

Here we create identifiers for Population, Container, Individual, Sex, Moisture and Student for 12 populations, 72 containers, 30 individuals per container, and 3 conditions (moisture levels). The data are generated at the individual level and then expanded to the observation level with three observations per individual.

Code

# number of things
n_populations <- 12 # number of populations
n_containers <- 72 # number of containers
n_samples <- 30 # number of samples per container
n_individuals <- 2160 # number of individuals

# Data at the individual level

# 12 different populations 
Population <- gl(n_populations, n_samples*2*3, n_individuals) # 2 sex and 3 conditions

# 120 containers (30 individuals in each container)
Container <- gl(n_containers, n_samples, n_individuals)

# Individual ID
Individual <- factor(1:n_individuals)

# Sex of the individuals. Uni-sex within each container (individuals are sorted at the pupa stage)
Sex <- factor(rep(c(rep("Female", n_samples), rep("Male", n_samples)), n_containers/2))

# Observer (person A and B)

Student <- factor(rep(c("A", "B"), each = n_individuals/2))

# Moisture levels
Moisture <- rep(rep(0:2, each= 60), n_populations)  # 3 levels of moisture (1, 2, 3)

# the first basic data set with matrix of 2160 x 6
dat1 <- data.frame(Population = Population, 
                   Individual = Individual, 
                   Container = Container, 
                   Sex = Sex, 
                   Moisture = Moisture, 
                   Student = Student)

# Data at the observation level
#Individual <- factor(rep(1:n_individuals, 3)) # 3 observations per individual

# Basic data set with a matrix of 6480 x 5 
dat2 <- rbind(dat1, dat1, dat1) # 3 observations per individual
dat2 <- dat2[, -6] # taking out student info (as activity level was measured by a machine)

5 Body lengh

Body length is a continuous trait that is normally distributed.

5.1 Data generation

Code

# Setting the seed
set.seed(22376)

# Simulate random effects for Populations
Population_intercept <- rnorm(n_populations, 0, sqrt(1.3)) # variance of 1.3
Population_slope     <- rnorm(n_populations, 0, sqrt(0.1)) # variance of 0.1

# Simulate random effect for containers (random intercept)
Container_intercept <- rnorm(n_containers, 0, sqrt(0.1)) # variance of 0.1

# Generate outcome (BodyL) with both random intercept and random slope for Moisture per population
dat1$BodyL <- 15 - 3 * (as.numeric(dat1$Sex) - 1) + 
  0.4 * dat1$Moisture +                                   # fixed effect of Moisture
  0.05 * (as.numeric(dat1$Student) - 1) +                 # fixed effect of Student
  Population_intercept[dat1$Population] +                 # random intercept: Population
  Population_slope[dat1$Population] * dat1$Moisture +     # random slope: Population x Moisture
  Container_intercept[dat1$Container] +                   # random intercept: Container
  rnorm(n_individuals, 0, sqrt(1))                        # residual (variance of 1)

5.2 Model fitting

We use the lmer function in lme4 to fix a ‘full’ model, which completely match with the data generating process. The model includes fixed effects of Sex, Moisture and Student, and random intercepts for Container and random slopes for Moisture by Population.

Our target clustering (grouping) variable is Population, for which we want to estimate repeatability.

Code

# fitting a mixed model
size_full <- lmer(BodyL ~ 1 + Sex + Moisture + Student + 
                    (1|Container) + (1 + Moisture|Population),
                  data = dat1)
summary(size_full)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: BodyL ~ 1 + Sex + Moisture + Student + (1 | Container) + (1 +  
##     Moisture | Population)
##    Data: dat1
## 
## REML criterion at convergence: 6230.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.1363 -0.6899  0.0214  0.6819  3.3819 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev. Corr 
##  Container  (Intercept) 0.08078  0.2842        
##  Population (Intercept) 3.04570  1.7452        
##             Moisture    0.10063  0.3172   -0.56
##  Residual               0.97053  0.9852        
## Number of obs: 2160, groups:  Container, 72; Population, 12
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept) 14.59595    0.66531 11.57950  21.938 8.58e-11 ***
## SexMale     -2.89866    0.07928 47.00097 -36.563  < 2e-16 ***
## Moisture     0.46650    0.10365 10.99947   4.501   0.0009 ***
## StudentB     1.39444    0.85635  9.99965   1.628   0.1345    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##          (Intr) SexMal Moistr
## SexMale  -0.060              
## Moisture -0.410  0.000       
## StudentB -0.644  0.000  0.000

We can also fit a reduced model, which only fit our target clustering variable, Population.

Code

# fitting a reduced model
size_reduced <- lmer(BodyL ~ 1 + (1|Population), 
                     data = dat1)
summary(size_reduced)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: BodyL ~ 1 + (1 | Population)
##    Data: dat1
## 
## REML criterion at convergence: 8802.1
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.55616 -0.79195 -0.00211  0.81433  2.56497 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  Population (Intercept) 2.508    1.584   
##  Residual               3.355    1.832   
## Number of obs: 2160, groups:  Population, 12
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  14.3103     0.4588 11.0000   31.19 4.37e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.3 Variance decomposition

We now want to get variance for each fixed and random effect, including residuals. We can use the rptR and partR2 package to do this.

Code

# the same model as `size_full`
size_rpt_full <- rpt(BodyL ~ 1 + Student + Sex + Moisture + 
                       (1 | Container) + (1 + Moisture| Population), 
                     grname = c("Container", "Population", "Fixed", "Residual"), 
                     data = dat1, 
                     datatype = "Gaussian",
                     nboot = 1000, 
                     npermut = 0, 
                     ratio = FALSE)

# saving
saveRDS(size_rpt_full, here("Rdata", "size_rpt_full.rds"))


# partR2 cannot process a random slope model so take out
# we are assuming that fixed variances are teh same with or without the random slope
size_full2 <- lmer(BodyL ~ 1 + Sex + Moisture + Student + 
                    (1|Container) + (1|Population),
                  data = dat1)

# No random slopes
size_part <- partR2(size_full2, partvars = c("Moisture", "Sex", "Student"),
                    nboot = NULL)

# saving
saveRDS(size_part, here("Rdata", "size_part.rds"))

Code

# reading in Rdata

size_rpt_full <- readRDS(here("Rdata", "size_rpt_full.rds"))
size_part <- readRDS(here("Rdata", "size_part.rds"))

size_rpt_full
## 
## 
## Variance estimation using the lmm method 
## 
## Variance of Container
## Var = 0.081
## SE = 0.023
## CI = [0.039, 0.13]
## P  = 4.89e-14 [LRT]
##      NA [Permutation]
## 
## Variance of Population
## Var = 2.591
## SE = 1.163
## CI = [0.826, 5.182]
## P  = 9.01e-31 [LRT]
##      NA [Permutation]
## 
## Variance of Residual
## Var = 0.971
## SE = 0.03
## CI = [0.911, 1.032]
## P  = NA [LRT]
##      NA [Permutation]
## 
## Variance of Fixed
## Var = 2.733
## SE = 0.759
## CI = [2.145, 4.783]
## P  = NA [LRT]
##      NA [Permutation]
size_part
## 
## 
## R2 (marginal) and 95% CI for the full model: 
##  R2     CI_lower CI_upper nboot ndf
##  0.4059 NA       NA       1     4  
## 
## ----------
## 
## Part (semi-partial) R2:
##  Predictor(s)         R2     CI_lower CI_upper nboot ndf
##  Model                0.4059 NA       NA       1     4  
##  Moisture             0.0239 NA       NA       1     3  
##  Sex                  0.3458 NA       NA       1     3  
##  Student              0.0362 NA       NA       1     3  
##  Moisture+Sex         0.3697 NA       NA       1     2  
##  Moisture+Student     0.0601 NA       NA       1     2  
##  Sex+Student          0.3820 NA       NA       1     2  
##  Moisture+Sex+Student 0.4059 NA       NA       1     1

Code

# the same model as `size_full`
size_rpt_reduced <- rpt(BodyL ~ 1 + (1|Population),
                     grname = c("Population", "Residual"), 
                     data = dat1, 
                     datatype = "Gaussian",
                     nboot = 1000, 
                     npermut = 0, 
                     ratio = FALSE)

# saving
saveRDS(size_rpt_reduced, here("Rdata", "size_rpt_reduced.rds"))

Code

# reading in Rdata
size_rpt_reduced <- readRDS(here("Rdata", "size_rpt_reduced.rds"))

size_rpt_reduced
## 
## 
## Variance estimation using the lmm method 
## 
## Variance of Population
## Var = 2.508
## SE = 1.084
## CI = [0.918, 5.072]
## P  = 1.05e-234 [LRT]
##      NA [Permutation]
## 
## Variance of Residual
## Var = 3.355
## SE = 0.1
## CI = [3.166, 3.554]
## P  = NA [LRT]
##      NA [Permutation]

5.4 Repeatability calculation

Our target clustering variable is Population, and we want to estimate repeatability (R) as the ratio of the variance of the target clustering variable (Population) to the total variance.

Here we have two kinds of effects: 1) effects which we want to remove (adjust) from repeatability calculation (Student and Container) and 2) effects which we want to retain (Sex, Moistureand Population).

Code

# this is mixture of Rv1, Rv2 and Rv3
# also, this repeatability is marginalized in relation to `Moisture`
# getting adjusted fixed effect variance.
# getting % of `Sex` and `Moisture in relation to the total fixed effect variance
adj_factor <-  size_part$R2[5 ,"estimate"][[1]]/ size_part$R2[8 ,"estimate"][[1]]

# getting % of 

# variance for `Sex`, `Moisture'
Fixed_point <- size_rpt_full$R[1, "Fixed"] * adj_factor 
Fixed_boot <- size_rpt_full$R_boot[, "Fixed"] * adj_factor

# getting % of 'Moisture' to the total fixed effect variance
adj_factor2 <-  size_part$R2[2 ,"estimate"][[1]]/ size_part$R2[8 ,"estimate"][[1]]

# variance for `Moisture'
Fixed_point2 <- size_rpt_full$R[1, "Fixed"] * adj_factor2 
Fixed_boot2 <- size_rpt_full$R_boot[, "Fixed"] * adj_factor2
  
# point estimate
(size_rpt_full$R[1, "Population"] +  Fixed_point)/ 
              (size_rpt_full$R[1, "Population"] + 
               Fixed_point +   
               size_rpt_full$R[1, "Residual"])
## [1] 0.8396141

# 95% CI using bootstrapping
CI <- (size_rpt_full$R_boot[, "Population"] + Fixed_boot) / 
  (size_rpt_full$R_boot[, "Population"] + 
   Fixed_boot + 
   size_rpt_full$R_boot[, "Residual"])

quantile(CI, c(0.025, 0.975))
##      2.5%     97.5% 
## 0.7700461 0.8936619


# repeatability - now removing `Sex` effect from numerator but not denominator
# point estimate
(size_rpt_full$R[1, "Population"] +  Fixed_point2)/ 
              (size_rpt_full$R[1, "Population"] + 
               Fixed_point +   
               size_rpt_full$R[1, "Residual"])
## [1] 0.4548239

# 95% CI using bootstrapping
CI <- (size_rpt_full$R_boot[, "Population"] + Fixed_boot2) / 
  (size_rpt_full$R_boot[, "Population"] + 
   Fixed_boot + 
   size_rpt_full$R_boot[, "Residual"])

quantile(CI, c(0.025, 0.975))
##      2.5%     97.5% 
## 0.2157872 0.6125273


# Extra - variance component for `Moisture` `Sex` and `Student`.

# Sex
Fixed_point - Fixed_point2
## [1] 2.328446

# Moisture
Fixed_point2
## [1] 0.1608182

# Student
size_rpt_full$R[1, "Fixed"] - Fixed_point
## [1] 0.2437726

Code

# ordinary repeatability

# point estimate
size_rpt_reduced$R[1, "Population"] / 
              (size_rpt_reduced$R[1, "Population"] + 
               size_rpt_reduced$R[1, "Residual"])
## [1] 0.4277224

# 95% CI using bootstrapping
CI <- size_rpt_reduced$R_boot[, "Population"] / 
  (size_rpt_reduced$R_boot[, "Population"] + 
   size_rpt_reduced$R_boot[, "Residual"])

quantile(CI, c(0.025, 0.975))
##      2.5%     97.5% 
## 0.2153815 0.5988730

6 Activity level

Activity level (log-transformed) is a continuous trait (i.e., mm traveled) that is log-normally distributed. We have three observations per individual.

6.1 Data generation

Code

# Setting the seed
set.seed(4242)

# Simulate random effects for Populations
Population_intercept <- rnorm(n_populations, 0, sqrt(1.3))
Population_slope     <- rnorm(n_populations, 0, sqrt(0.1)) # random slope variance (adjustable)

# Simulate random effect for containers (random intercept)
Container_intercept <- rnorm(n_containers, 0, sqrt(0.1))

# Individual intercept
Individual_intercept <- rnorm(n_individuals, 0, sqrt(2))



dat2$Activity <- exp(0.1 + 1.2 * (as.numeric(dat2$Sex) - 1) + 
                       0.3 * dat2$Moisture +                                   # fixed effect of Moisture
                       Population_intercept[dat2$Population] +                 # random intercept: Population
                       Population_slope[dat2$Population] * dat2$Moisture +     # random slope: Population x Moisture
                       Container_intercept[dat2$Container] +                   # random intercept: Container
                       Individual_intercept[dat2$Individual] +                 # random intercept: Individual
                       rnorm(n_individuals*3, 0, sqrt(0.5)) )                  # residual

6.2 Model fitting

We use lmer function in lme4 to fix a ‘full’ model, which completely match with the data generating process. The model includes fixed effects of Sex, Moisture, and random intercepts for Container and random slopes for Moisture by Population, and random intercepts for Individual.

Code

# fitting a mixed model

act_full <- lmer(log(Activity) ~ 1 + Sex +  Moisture  + 
               (1|Container) + (1 + Moisture|Population) + (1 | Individual), 
             data = dat2)
summary(act_full)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: 
## log(Activity) ~ 1 + Sex + Moisture + (1 | Container) + (1 + Moisture |  
##     Population) + (1 | Individual)
##    Data: dat2
## 
## REML criterion at convergence: 19620.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.5039 -0.5558 -0.0091  0.5678  2.8141 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev. Corr
##  Individual (Intercept) 1.9698   1.4035       
##  Container  (Intercept) 0.1846   0.4296       
##  Population (Intercept) 1.6600   1.2884       
##             Moisture    0.0801   0.2830   0.13
##  Residual               0.5072   0.7122       
## Number of obs: 6480, groups:  Individual, 2160; Container, 72; Population, 12
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept) -0.09765    0.38829 11.53788  -0.251   0.8059    
## SexMale      1.24965    0.11922 46.99876  10.481 6.87e-14 ***
## Moisture     0.26923    0.10957 11.00181   2.457   0.0318 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##          (Intr) SexMal
## SexMale  -0.154       
## Moisture -0.036  0.000

We can also fit a reduced model, which only fit our target clustering variable, Indiviudal.

Code

# fitting a reduced model
act_reduced <- lmer(log(Activity) ~ 1 + (1|Individual), 
                     data = dat2)
summary(act_reduced)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: log(Activity) ~ 1 + (1 | Individual)
##    Data: dat2
## 
## REML criterion at convergence: 21073.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.5923 -0.5556 -0.0123  0.5626  2.7947 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  Individual (Intercept) 4.3127   2.0767  
##  Residual               0.5072   0.7122  
## Number of obs: 6480, groups:  Individual, 2160
## 
## Fixed effects:
##              Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept) 7.964e-01  4.555e-02 2.159e+03   17.48   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

6.3 Variance decomposition

We now want to get variance for each fixed and random effect, including residuals. We can use the rptR and partR2 package to do this.

Code

# the same model as `size_full`
act_rpt_full <- rpt(log(Activity) ~ 1 + Sex +  Moisture + 
                      (1|Container) + (1 + Moisture|Population) + (1 | Individual), 
             grname = c("Container", "Population", "Individual", "Fixed", "Residual"), 
             data = dat2, 
             datatype = "Gaussian",
             nboot = 1000, 
             npermut = 1000, 
             ratio = FALSE)

# saving
saveRDS(act_rpt_full, here("Rdata", "act_rpt_full.rds"))


# partR2 cannot process a random slope model so take out
act_full2 <-lmer(log(Activity) ~ 1 + Sex +  Moisture  + 
               (1|Container) + (1|Population) + (1 | Individual), 
             data = dat2)

# No random slopes
# we are assuming that fixed variances are teh same with or without the random slope
act_part <- partR2(act_full2, partvars = c("Moisture", "Sex"),
                    nboot = NULL)

# saving
saveRDS(act_part, here("Rdata", "act_part.rds"))

Code

# reading in Rdata

act_rpt_full <- readRDS(here("Rdata", "act_rpt_full.rds"))
act_part <- readRDS(here("Rdata", "act_part.rds"))

act_rpt_full
## 
## 
## Variance estimation using the lmm method 
## 
## Variance of Container
## Var = 0.185
## SE = 0.05
## CI = [0.09, 0.285]
## P  = 1.14e-14 [LRT]
##      0.004 [Permutation]
## 
## Variance of Population
## Var = 1.885
## SE = 0.791
## CI = [0.69, 3.668]
## P  = 3.01e-20 [LRT]
##      0.324 [Permutation]
## 
## Variance of Individual
## Var = 1.97
## SE = 0.068
## CI = [1.835, 2.105]
## P  = 0 [LRT]
##      0.001 [Permutation]
## 
## Variance of Residual
## Var = 0.507
## SE = 0.011
## CI = [0.487, 0.529]
## P  = NA [LRT]
##      NA [Permutation]
## 
## Variance of Fixed
## Var = 0.439
## SE = 0.083
## CI = [0.303, 0.623]
## P  = NA [LRT]
##      NA [Permutation]
act_part
## 
## 
## R2 (marginal) and 95% CI for the full model: 
##  R2     CI_lower CI_upper nboot ndf
##  0.0881 NA       NA       1     3  
## 
## ----------
## 
## Part (semi-partial) R2:
##  Predictor(s) R2     CI_lower CI_upper nboot ndf
##  Model        0.0881 NA       NA       1     3  
##  Moisture     0.0097 NA       NA       1     2  
##  Sex          0.0784 NA       NA       1     2  
##  Moisture+Sex 0.0881 NA       NA       1     1

Code

# the same model as `size_full`
act_rpt_reduced <- rpt(log(Activity) ~ 1 + (1 | Individual), 
                       grname = c("Individual", "Residual"), 
                       data = dat2, 
                       datatype = "Gaussian",
                       nboot = 1000, 
                       npermut = 0,
                       ratio = FALSE)


# saving
saveRDS(act_rpt_reduced, here("Rdata", "act_rpt_reduced.rds"))

Code

# reading in Rdata
act_rpt_reduced <- readRDS(here("Rdata", "act_rpt_reduced.rds"))

act_rpt_reduced
## 
## 
## Variance estimation using the lmm method 
## 
## Variance of Individual
## Var = 4.313
## SE = 0.133
## CI = [4.047, 4.583]
## P  = 0 [LRT]
##      NA [Permutation]
## 
## Variance of Residual
## Var = 0.507
## SE = 0.011
## CI = [0.486, 0.528]
## P  = NA [LRT]
##      NA [Permutation]

6.4 Repeatability calculation

Our target clustering variable is individual, and we want to estimate repeatability (R) as the ratio of the variance of the target clustering variable (individual) to the total variance.

Here we have two kinds of effects: 1) effects which we want to remove (adjust) from repeatability calculation (Container) and 2) effects which we want to retain (Sex, Moisture, Population and individual).

Code

# this is mixture of Rv1, Rv2 and Rv3
# also, this repeatability is marginalized in relation to `Moisture`
# getting adjusted fixed effect variance.


# getting % of 'Moisture' to the total fixed effect variance
adj_factor3 <-  act_part$R2[2 ,"estimate"][[1]]/ act_part$R2[4 ,"estimate"][[1]]

# variance for `Moisture'
Fixed_point3 <- act_rpt_full$R[1, "Fixed"] * adj_factor3 
Fixed_boot3 <- act_rpt_full$R_boot[, "Fixed"] * adj_factor3
  
# point estimate
(act_rpt_full$R[1, "Individual"] +  act_rpt_full$R[1, "Population"] + act_rpt_full$R[1, "Fixed"])/ 
              (act_rpt_full$R[1, "Individual"] +
               act_rpt_full$R[1, "Population"]+  
               act_rpt_full$R[1, "Fixed"] +   
               act_rpt_full$R[1, "Residual"])
## [1] 0.8943532

# 95% CI using bootstrapping
CI <- (act_rpt_full$R_boot[, "Individual"] + act_rpt_full$R_boot[, "Population"] + act_rpt_full$R_boot[, "Fixed"])/ 
  (act_rpt_full$R_boot[, "Individual"] + 
   act_rpt_full$R_boot[, "Population"] +   
   act_rpt_full$R_boot[, "Fixed"] + 
   act_rpt_full$R_boot[, "Residual"])

quantile(CI, c(0.025, 0.975))
##      2.5%     97.5% 
## 0.8575955 0.9236931


# repeatability - now removing `Sex` effect from numerator but not denominator
# point estimate
# point estimate
(act_rpt_full$R[1, "Individual"] +  act_rpt_full$R[1, "Population"] + Fixed_point3)/ 
              (act_rpt_full$R[1, "Individual"] +
               act_rpt_full$R[1, "Population"]+  
               act_rpt_full$R[1, "Fixed"] +   
               act_rpt_full$R[1, "Residual"])
## [1] 0.8130215

# 95% CI using bootstrapping
CI <- (act_rpt_full$R_boot[, "Individual"] + act_rpt_full$R_boot[, "Population"] + Fixed_boot3)/ 
  (act_rpt_full$R_boot[, "Individual"] + 
   act_rpt_full$R_boot[, "Population"] +   
   act_rpt_full$R_boot[, "Fixed"] + 
   act_rpt_full$R_boot[, "Residual"])

quantile(CI, c(0.025, 0.975))
##      2.5%     97.5% 
## 0.7446819 0.8672014

# Extra - variance component for `Moisture` and `Sex`. 

# Moisture
Fixed_point3
## [1] 0.04833146

# Sex
act_rpt_full$R[1, "Fixed"] - Fixed_point3
## [1] 0.3904651

Code

# ordinary repeatability

# point estimate
act_rpt_reduced$R[1, "Individual"] / 
              (act_rpt_reduced$R[1, "Individual"] + 
               act_rpt_reduced$R[1, "Residual"])
## [1] 0.8947696

# 95% CI using bootstrapping
CI <- act_rpt_reduced$R_boot[, "Individual"] / 
  (act_rpt_reduced$R_boot[, "Individual"] + 
   act_rpt_reduced$R_boot[, "Residual"])

quantile(CI, c(0.025, 0.975))
##      2.5%     97.5% 
## 0.8869256 0.9018396

7 Male morph

The frequencies of the two distinct male colour morphs: black (0) or reddish (1) is a binomial or Bernoulli trait.

7.1 Data generation

Code

# Setting the seed
set.seed(123)

# Subset the design matrix (only males express colour morphs)
datm <- dat1[dat1$Sex == "Male", ]

# Simulate random effects for Populations
Population_intercept <- rnorm(n_populations, 0, sqrt(1.2))

# Simulate random effect for containers (random intercept)
Container_intercept <- rnorm(n_containers, 0, sqrt(0.1))

# generation of response values on latent scale (!) based on fixed effects,
# random effects and residual errors
ColourL <- 0.8  + 
  (-0.6) * datm$Moisture +                                   # fixed effect of Moisture
  Population_intercept[datm$Population] +                 # random intercept: Population
  Container_intercept[datm$Container]                   # random intercept: Container
  
# data generation (on data scale!) based on binomial distribution
datm$Colour <- rbinom(length(ColourL), 1, plogis(ColourL))

7.2 Model fitting

We use glmer function in lme4 to fix a ‘full’ model, which completely match with the data generating process. The model includes a fixed effect of Moisture, and random intercepts for Container and Population.

Code

# fitting a mixed model
morph_full <- glmer(Colour ~ 1 + Moisture + 
                    (1|Container) + (1|Population),
                  family = binomial(link = "logit"),
                  data = datm)
summary(morph_full)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: Colour ~ 1 + Moisture + (1 | Container) + (1 | Population)
##    Data: datm
## 
##       AIC       BIC    logLik -2*log(L)  df.resid 
##    1278.8    1298.7    -635.4    1270.8      1076 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.8762 -0.7389  0.3922  0.7478  2.1894 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  Container  (Intercept) 0.2030   0.4505  
##  Population (Intercept) 0.8095   0.8997  
## Number of obs: 1080, groups:  Container, 36; Population, 12
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   1.0734     0.3091   3.473 0.000515 ***
## Moisture     -0.6605     0.1282  -5.153 2.56e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##          (Intr)
## Moisture -0.425

We can also fit a reduced model, which only fit our target clustering variable, Population.

Code

# fitting a reduced model
morph_reduced <- glmer(Colour ~ 1 + (1|Population), 
                     family = binomial(link = "logit"),
                     data = datm)
summary(morph_reduced)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: Colour ~ 1 + (1 | Population)
##    Data: datm
## 
##       AIC       BIC    logLik -2*log(L)  df.resid 
##    1344.6    1354.6    -670.3    1340.6      1078 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.5536 -0.9101  0.4732  0.9286  1.5800 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  Population (Intercept) 0.7524   0.8674  
## Number of obs: 1080, groups:  Population, 12
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.3952     0.2596   1.522    0.128

7.3 Variance decomposition

We now want to get variance for each fixed and random effect, including residuals. We can use just rptR as we only have one fixed effect so fixed effects do not need to be decomposed further.

Code

# the same model as `size_full`
morph_rpt_full <- rpt(Colour ~ 1 + Moisture + 
                       (1 | Container) + (1 | Population), 
                     grname = c("Container", "Population", "Fixed", "Residual"), 
                     data = datm, 
                     datatype = "Binary",
                     nboot = 1000, 
                     npermut = 0, 
                     ratio = FALSE)

# saving
saveRDS(morph_rpt_full, here("Rdata", "morph_rpt_full.rds"))

Code

# reading in Rdata

morph_rpt_full <- readRDS(here("Rdata", "morph_rpt_full.rds"))

morph_rpt_full
## 
## 
## Variance estimation using the glmm method and logit link 
## 
## Variance of Container
## --------------------------------
## Link-scale approximation:
## Var = 0.203
## SE = 0.114
## CI = [0.002, 0.455]
## P  = 0.000755 [LRT]
##      0.001 [Permutation]
## 
## Variance of Population
## --------------------------------
## Link-scale approximation:
## Var = 0.81
## SE = 0.381
## CI = [0.184, 1.635]
## P  = 2.15e-05 [LRT]
##      0.001 [Permutation]
## 
## Variance of Residual
## --------------------------------
## Link-scale approximation:
## Var = 4.107
## SE = 0.166
## CI = [4, 4.586]
## P  = NA [LRT]
##      NA [Permutation]
## 
## Variance of Fixed
## --------------------------------
## Link-scale approximation:
## Var = 0.291
## SE = 0.115
## CI = [0.12, 0.561]
## P  = NA [LRT]
##      NA [Permutation]

Code

# the same model as `size_full`
morph_rpt_reduced <- rpt(Colour ~ 1 + (1|Population),
                     grname = c("Population", "Residual"), 
                     data = datm, 
                     datatype = "Binary",
                     nboot = 1000, 
                     npermut = 0, 
                     ratio = FALSE)

# saving
saveRDS(morph_rpt_reduced, here("Rdata", "morph_rpt_reduced.rds"))

Code

# reading in Rdata
morph_rpt_reduced <- readRDS(here("Rdata", "morph_rpt_reduced.rds"))

morph_rpt_reduced
## 
## 
## Variance estimation using the glmm method and logit link 
## 
## Variance of Population
## --------------------------------
## Link-scale approximation:
## Var = 0.752
## SE = 0.345
## CI = [0.186, 1.548]
## P  = 4.48e-30 [LRT]
##      NA [Permutation]
## 
## Variance of Residual
## --------------------------------
## Link-scale approximation:
## Var = 4.107
## SE = 0.181
## CI = [4.001, 4.636]
## P  = NA [LRT]
##      NA [Permutation]

7.4 Repeatability calculation

Our target clustering variable is Population, and we want to estimate repeatability (R) as the ratio of the variance of the target clustering variable (Population) to the total variance.

Here we have two kinds of effects: 1) effects which we want to remove (adjust) from repeatability calculation (Container) and 2) effects which we want to retain (Moistureand Population).

Code

# this is mixture of Rv1 and Rv3

# point estimate
(morph_rpt_full$R[2, "Population"] +  morph_rpt_full$R[2, "Fixed"])/ 
              (morph_rpt_full$R[2, "Population"] + 
               morph_rpt_full$R[2, "Fixed"] + 
               morph_rpt_full$R[2, "Residual"])
## [1] 0.2113642

# 95% CI using bootstrapping

CI <- (morph_rpt_full$R_boot_link$Population + morph_rpt_full$R_boot_link$Fixed) / 
  (morph_rpt_full$R_boot_link$Population + 
   morph_rpt_full$R_boot_link$Fixed + 
   morph_rpt_full$R_boot_link$Residual)
quantile(CI, c(0.025, 0.975))
##       2.5%      97.5% 
## 0.09260895 0.32533574

# Instead of using `Residual`, we could use pi^2/3, which is theoretical distribution-specific variance for family = binomial(logit)

# point estimate
(morph_rpt_full$R[2, "Population"] +  morph_rpt_full$R[2, "Fixed"])/ 
              (morph_rpt_full$R[2, "Population"] + 
               morph_rpt_full$R[2, "Population"] + 
               (pi^2)/3 )
## [1] 0.2242083

# 95% CI using bootstrapping
CI <- (morph_rpt_full$R_boot_link$Population + morph_rpt_full$R_boot_link$Fixed) / 
  (morph_rpt_full$R_boot_link$Population + 
   morph_rpt_full$R_boot_link$Fixed + 
    (pi^2)/3 )
quantile(CI, c(0.025, 0.975))
##     2.5%    97.5% 
## 0.116277 0.377135

Code

# ordinary repeatability

# point estimate
morph_rpt_reduced$R[2, "Population"] / 
              (morph_rpt_reduced$R[2, "Population"] + 
               morph_rpt_reduced$R[2, "Residual"])
## [1] 0.1548503

# 95% CI using bootstrapping
CI <- morph_rpt_reduced$R_boot_link$Population / 
  (morph_rpt_reduced$R_boot_link$Population + 
   morph_rpt_reduced$R_boot_link$Residual)
quantile(CI, c(0.025, 0.975))
##       2.5%      97.5% 
## 0.04342031 0.26678571


# Instead of using `Residual`, we could use pi^2/3, which is theoretical distribution-specific variance for family = binomial(logit)

# point estimate
morph_rpt_reduced$R[2, "Population"] / 
              (morph_rpt_reduced$R[2, "Population"] + 
                (pi^2)/3 )
## [1] 0.186137

# 95% CI using bootstrapping
CI <- morph_rpt_reduced$R_boot_link$Population / 
  (morph_rpt_reduced$R_boot_link$Population + 
   (pi^2)/3 )
quantile(CI, c(0.025, 0.975))
##       2.5%      97.5% 
## 0.05346076 0.32003024

8 The number of eggs

The number of eggs laid by each female is Poisson-distributed.

8.1 Data generation

Code


# Setting the seed
set.seed(7777)

# Subset the design matrix (only females lay eggs)
datf <- dat1[dat1$Sex == "Female", ]

# Simulate random effects for Populations
Population_intercept <- rnorm(n_populations, 0, sqrt(0.4))

# Simulate random effect for containers (random intercept)
Container_intercept <- rnorm(n_containers, 0, sqrt(0.05))

# generation of response values on latent scale (!) based on fixed effects,
# random effects and residual errors
EggL <- 0.1  + 
  (0.4) * datf$Moisture +                                   # fixed effect of Moisture
  Population_intercept[datf$Population] +                 # random intercept: Population
  Container_intercept[datf$Container] +                   # random intercept: Container
  rnorm(n_individuals/2, 0, sqrt(1))    
# data generation (on data scale!) based on Poisson distribution
datf$Egg <- rpois(length(EggL), exp(EggL))

8.2 Model fitting

One note here is that we have to model overdispersion by fitting observation-level random effects (OLREs) for Individual in addition to the random intercepts for Container and Population. See the following papers for overdispersion and OLREs

Harrison, X. A. (2014). Using observation-level random effects to model overdispersion in count data in ecology and evolution. PeerJ, 2, e616.

Nakagawa, S., & Schielzeth, H. (2010). Repeatability for Gaussian and non‐Gaussian data: a practical guide for biologists. Biological Reviews, 85(4), 935-956.

Code

# fitting a mixed model
# Note ``
egg_full <- glmer(Egg ~ 1 + Moisture + 
                      (1|Container) + (1|Population) + (1|Individual), 
                    data = datf, 
                    family = poisson)

summary(egg_full)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: Egg ~ 1 + Moisture + (1 | Container) + (1 | Population) + (1 |  
##     Individual)
##    Data: datf
## 
##       AIC       BIC    logLik -2*log(L)  df.resid 
##    5512.1    5537.1   -2751.1    5502.1      1075 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.25085 -0.55220 -0.09204  0.20005  1.05208 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  Individual (Intercept) 1.06570  1.0323  
##  Container  (Intercept) 0.08617  0.2935  
##  Population (Intercept) 0.41207  0.6419  
## Number of obs: 1080, groups:  Individual, 1080; Container, 36; Population, 12
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  0.37155    0.21088   1.762   0.0781 .  
## Moisture     0.47777    0.07573   6.309 2.81e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##          (Intr)
## Moisture -0.372

We can also fit a reduced model, which fit our target clustering variable, Population as well as Individual to account for overdispersion.

Code

egg_reduced <- glmer(Egg ~ 1 + (1|Population) + (1|Individual), 
                    data = datf, 
                    family = poisson)

summary(egg_reduced)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: Egg ~ 1 + (1 | Population) + (1 | Individual)
##    Data: datf
## 
##       AIC       BIC    logLik -2*log(L)  df.resid 
##    5623.3    5638.2   -2808.6    5617.3      1077 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.12131 -0.54957 -0.04214  0.20807  0.73381 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev.
##  Individual (Intercept) 1.2531   1.1194  
##  Population (Intercept) 0.4373   0.6613  
## Number of obs: 1080, groups:  Individual, 1080; Population, 12
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.8543     0.1955   4.371 1.24e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

8.3 Variance decomposition

We now want to get variance for each fixed and random effect, including residuals. We can use just rptR as we only have one fixed effect so fixed effects do not need to be decomposed further.

Code

# the same model as `egg_full` yet we do not need to fit `Individual` as OLREs
egg_rpt_full <- rpt(Egg ~ 1 + Moisture + 
                       (1 | Container) + (1 | Population), 
                     grname = c("Container", "Population", "Fixed", "Residual"), 
                     data = datf, 
                     datatype = "Poisson",
                     nboot = 1000, 
                     npermut = 0, 
                     ratio = FALSE)

# saving
saveRDS(egg_rpt_full, here("Rdata", "egg_rpt_full.rds"))

Code

# reading in Rdata

egg_rpt_full <- readRDS(here("Rdata", "egg_rpt_full.rds"))

egg_rpt_full
## 
## 
## Variance estimation using the glmm method and log link 
## 
## Variance of Container
## --------------------------------
## Link-scale approximation:
## Var = 0.086
## SE = 0.039
## CI = [0.013, 0.168]
## P  = 2.95e-05 [LRT]
##      NA [Permutation]
## 
## Variance of Population
## --------------------------------
## Link-scale approximation:
## Var = 0.412
## SE = 0.18
## CI = [0.077, 0.762]
## P  = 1.6e-06 [LRT]
##      NA [Permutation]
## 
## Variance of Residual
## --------------------------------
## Link-scale approximation:
## Var = 1.235
## SE = 0.073
## CI = [1.065, 1.353]
## P  = NA [LRT]
##      NA [Permutation]
## 
## Variance of Fixed
## --------------------------------
## Link-scale approximation:
## Var = 0.152
## SE = 0.05
## CI = [0.071, 0.264]
## P  = NA [LRT]
##      NA [Permutation]

Code

# the same model as `size_full`
egg_rpt_reduced <- rpt(Egg ~ 1 + (1|Population),
                     grname = c("Population", "Residual"), 
                     data = datf, 
                     datatype = "Poisson",
                     nboot = 1000, 
                     npermut = 0, 
                     ratio = FALSE)

# saving
saveRDS(egg_rpt_reduced, here("Rdata", "egg_rpt_reduced.rds"))

Code

# reading in Rdata
egg_rpt_reduced <- readRDS(here("Rdata", "egg_rpt_reduced.rds"))

egg_rpt_reduced
## 
## 
## Variance estimation using the glmm method and log link 
## 
## Variance of Population
## --------------------------------
## Link-scale approximation:
## Var = 0.437
## SE = 0.183
## CI = [0.105, 0.82]
## P  = 7.12e-40 [LRT]
##      NA [Permutation]
## 
## Variance of Residual
## --------------------------------
## Link-scale approximation:
## Var = 1.423
## SE = 0.079
## CI = [1.232, 1.549]
## P  = NA [LRT]
##      NA [Permutation]

8.4 Repeatability calculation

Our target clustering variable is Population, and we want to estimate repeatability (R) as the ratio of the variance of the target clustering variable (Population) to the total variance.

Here we have two kinds of effects: 1) effects which we want to remove (adjust) from repeatability calculation (Container) and 2) effects which we want to retain (Moistureand Population; Note Individual would be included in Residual).

Code

# this is mixture of Rv1 and Rv3

# point estimate
(egg_rpt_full$R[2, "Population"] +  egg_rpt_full$R[2, "Fixed"])/ 
              (egg_rpt_full$R[2, "Population"] + 
               egg_rpt_full$R[2, "Fixed"] + 
               egg_rpt_full$R[2, "Residual"])
## [1] 0.3136216

# 95% CI using bootstrapping

CI <- (egg_rpt_full$R_boot_link$Population + egg_rpt_full$R_boot_link$Fixed) / 
  (egg_rpt_full$R_boot_link$Population + 
   egg_rpt_full$R_boot_link$Fixed + 
   egg_rpt_full$R_boot_link$Residual)
quantile(CI, c(0.025, 0.975))
##      2.5%     97.5% 
## 0.1540324 0.4429450

Code

# ordinary repeatability

# point estimate
egg_rpt_reduced$R[2, "Population"] / 
              (egg_rpt_reduced$R[2, "Population"] + 
               egg_rpt_reduced$R[2, "Residual"])
## [1] 0.2351311

# 95% CI using bootstrapping
CI <- egg_rpt_reduced$R_boot_link$Population / 
  (egg_rpt_reduced$R_boot_link$Population + 
   egg_rpt_reduced$R_boot_link$Residual)
quantile(CI, c(0.025, 0.975))
##       2.5%      97.5% 
## 0.07134811 0.37619849

9 Bonus 1: Conditional repeatability (body length)

The formula for conditional repeatability is (see the paper for more details):

\[ R_{conditional} = \frac{\sigma_{u_{0}}^2 + \sigma_{u_{1}}^2 x_{1^*} + 2\rho\sigma_{u_{0}}\sigma_{u_{1}} x_{1^*}}{\sigma_{u_{0}}^2 + \sigma_{u_{1}}^2 x_{1^*} + 2\rho\sigma_{u_{0}}\sigma_{u_{1}} x_{1^*} + \sigma^2_\epsilon} \] Here we will use the model we used earlier size_full (the Body length model). Also, for simplicity, we will focus on Population and Residual, ignoring all the fixed effects and container (i.e, we calculate ‘adjusted’ repeatability). In the above formula, the variance for Population is \(\sigma_{u_{0}}^2\) while the variance for Residual is \(\sigma^2_\epsilon\). The variance for Moisture is \(\sigma_{u_{1}}^2\) and the covariance between Population and Moisture is \(\sigma^2_\epsilon\).

This is the model output

Code

summary(size_full)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: BodyL ~ 1 + Sex + Moisture + Student + (1 | Container) + (1 +  
##     Moisture | Population)
##    Data: dat1
## 
## REML criterion at convergence: 6230.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.1363 -0.6899  0.0214  0.6819  3.3819 
## 
## Random effects:
##  Groups     Name        Variance Std.Dev. Corr 
##  Container  (Intercept) 0.08078  0.2842        
##  Population (Intercept) 3.04570  1.7452        
##             Moisture    0.10063  0.3172   -0.56
##  Residual               0.97053  0.9852        
## Number of obs: 2160, groups:  Container, 72; Population, 12
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept) 14.59595    0.66531 11.57950  21.938 8.58e-11 ***
## SexMale     -2.89866    0.07928 47.00097 -36.563  < 2e-16 ***
## Moisture     0.46650    0.10365 10.99947   4.501   0.0009 ***
## StudentB     1.39444    0.85635  9.99965   1.628   0.1345    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##          (Intr) SexMal Moistr
## SexMale  -0.060              
## Moisture -0.410  0.000       
## StudentB -0.644  0.000  0.000

We have Moisture with 0, 1 and 2 and we can get 3 repeatability estimates using the formula above.

Code


sigma2_pop_int <- VarCorr(size_full)$Population[1, 1] # variance for Population intercept

sigma2_pop_slope <- VarCorr(size_full)$Population[2, 2] # variance for Population slope 

sigma_pop_cov <- VarCorr(size_full)$Population[1, 2] # covariance between Population intercept and slope

sigma2_residual <- attr(VarCorr(size_full), "sc")^2 # residual variance

# repeatability for `Moisture` = 0

R_cond_0 <- (sigma2_pop_int + sigma2_pop_slope * 0 + 2 * sigma_pop_cov * 0) / 
  (sigma2_pop_int + sigma2_pop_slope * 0 + 2 * sigma_pop_cov * 0 + sigma2_residual)

# repeatability for `Moisture` = 1

R_cond_1 <- (sigma2_pop_int + sigma2_pop_slope * 1 + 2 * sigma_pop_cov * 1) / 
  (sigma2_pop_int + sigma2_pop_slope * 1 + 2 * sigma_pop_cov * 1 + sigma2_residual)

# repeatability for `Moisture` = 2

R_cond_2 <- (sigma2_pop_int + sigma2_pop_slope * 2 + 2 * sigma_pop_cov * 2) / 
  (sigma2_pop_int + sigma2_pop_slope * 2 + 2 * sigma_pop_cov * 2 + sigma2_residual)

# printing out
R_cond_0
## [1] 0.7583483
R_cond_1
## [1] 0.7223008
R_cond_2
## [1] 0.6736132

10 Bonus 2: Creating the same dataset using squidSim

Code

# installing and loading squidSim - https://github.com/squidgroup/squidSim
#devtools::install_github("squidgroup/squidSim")
library(squidSim)
library(lme4)

## First Example

### DATA STRUCTURE

## Two students, each do 6 populations (12 in total), each with 30 individuals, in each combination of the two sexes and three moisture treatments (giving unique individuals 2160 in total). The unique combination of Population, Sex and Moisture is the Container (72 in total) and the unique combination of ID within a container, Sex and Moisture is the Individual.

ds1 <- make_structure("Student(2)/Population(6)/ID(30)
  + Sex(2)+ Moisture(3) 
  + Population:Sex:Moisture 
  + ID:Sex:Moisture",
  repeat_obs=1,
  level_names=list(
    Student=c("A","B"), 
    Sex=c("Female","Male"),
    Moisture=0:2
  ),
  int_names = c("Container","Individual") 
)
head(ds1)

### DATA GENERATION

full_sim1 <- simulate_population(
  seed=22376,
  data_structure = ds1,
  response_name="BodyL",
  parameters = list(
    intercept= 15,
    Population = list(
      names = c("Population_intercept","Population_slope"),
      vcov = c(1.3,0.1),
      beta= c(1,0) # slopes not included in main additive part
    ),
    Container = list(
      names=c("Container_intercept"),
      vcov=0.1
    ),
    Sex = list(
      fixed=TRUE, # tells squidSim it is a factor
      names = c("Female","Male"),
      beta = c(0,-3) # Female is baseline, then difference between baseline Female and Male
      ),
    Student = list(
      fixed=TRUE, # tells squidSim it is a factor
      names = c("A","B"),
      beta = c(0,0.05) # A is baseline, then difference between baseline A and B
      ),
    Moisture = list(
      covariate=TRUE, # tells squidSim it is continuous (not a grouping factor)
      names = c("Moisture_cont"),
      beta = c(0.4)
      ),
    interactions = list( 
      names = "Population_slope:Moisture_cont", #random slopes
      beta=1
    ),
    residual=list(
      vcov = 1
    )
  )
)
dat1 <- get_population_data(full_sim1)

size_full <- lmer(BodyL ~ 1 + Sex + Moisture + Student + 
                    (1|Container) + (1 + Moisture|Population),
                  data = dat1)


summary(size_full)


## EXAMPLE 2 - Activity level

ds2 <- make_structure("Population(12)/ID(30)
  + Sex(2)+ Moisture(3) 
  + Population:Sex:Moisture 
  + ID:Sex:Moisture",
  repeat_obs=3, ## now with 3 obs per individual
  level_names=list(
    Sex=c("Female","Male"),
    Moisture=0:2
  ),
  int_names = c("Container","Individual") 
)


full_sim2 <- simulate_population(
  seed=4242,
  data_structure = ds2,
  response_name="Activity",
  link="log",
  parameters = list(
    intercept= 0.1,
    Population = list(
      names = c("Population_intercept","Population_slope"),
      vcov = c(1.3,0.1),
      beta= c(1,0) # slopes not included in main additive part
    ),
    Container = list(
      vcov=0.1
    ),
    Individual = list(
      vcov=2
    ),
    Sex = list(
      fixed=TRUE, # tells squidSim it is a factor
      names = c("Female","Male"),
      beta = c(0,1.2) # Female is baseline, then difference between baseline Female and Male
      ),
    Moisture = list(
      covariate=TRUE, # tells squidSim it is continuous (not a grouping factor)
      names = c("Moisture_cont"),
      beta = c(0.3)
      ),
    interactions = list( 
      names = "Population_slope:Moisture_cont", #random slopes
      beta=1
    ),
    residual=list(
      vcov = 0.5
    )
  )
)
dat2 <- get_population_data(full_sim2)

act_full <- lmer(log(Activity) ~ 1 + Sex +  Moisture  + 
               (1|Container) + (1 + Moisture|Population) + (1 | Individual), 
             data = dat2)
summary(act_full)


## example 3 - Male Morph

ds_m <- ds1[ds1$Sex == "Male", ]

full_sim3 <- simulate_population(
  seed=125,
  data_structure = ds_m,
  response_name="Colour",
  link="logit",
  family="binomial",
  parameters = list(
    intercept= 0.8,
    Population = list(
      vcov = 1.2
    ),
    Container = list(
      vcov=0.1
    ),
    Moisture = list(
      covariate=TRUE, # tells squidSim it is continuous (not a grouping factor)
      beta = c(-0.6)
      ),
    residual=list(
      vcov = 0
    )
  )
)
datm <- get_population_data(full_sim3)


morph_full <- glmer(Colour ~ 1 + Moisture + 
                    (1|Container) + (1|Population),
                  family = binomial(link = "logit"),
                  data = datm)
summary(morph_full)


## example 4 - Egg number

ds_f <- ds1[ds1$Sex == "Female", ]

full_sim4 <- simulate_population(
  seed=125,
  data_structure = ds_f,
  response_name="Egg",
  link="log",
  family="poisson",
  parameters = list(
    intercept= 0.1,
    Population = list(
      vcov = 0.4
    ),
    Container = list(
      vcov=0.05
    ),
    Moisture = list(
      covariate=TRUE, # tells squidSim it is continuous (not a grouping factor)
      beta = c(0.4)
      ),
    residual=list(
      vcov = 1
    )
  )
)
datf <- get_population_data(full_sim4)

egg_full <- glmer(Egg ~ 1 + Moisture + 
                      (1|Container) + (1|Population) + (1|Individual), 
                    data = datf, 
                    family = poisson)

summary(egg_full)

11 Software and package versions

Code

sessioninfo::session_info() %>%
  details::details(
    summary = 'Current session info',
    open    = F
  ) %>% pander()
## Warning in pander.default(.): No pander.method for "details_console", reverting
## to default.

Current session info


─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.5.0 (2025-04-11)
 os       macOS 26.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/Edmonton
 date     2025-11-20
 pandoc   3.6.3 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
 quarto   1.7.32 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/quarto

─ Packages ───────────────────────────────────────────────────────────────────
 package      * version    date (UTC) lib source
 boot           1.3-31     2024-08-28 [2] CRAN (R 4.5.0)
 cli            3.6.5      2025-04-23 [2] CRAN (R 4.5.0)
 clipr          0.8.0      2022-02-22 [2] CRAN (R 4.5.0)
 desc           1.4.3      2023-12-10 [2] CRAN (R 4.5.0)
 details      * 0.4.0      2025-02-09 [2] CRAN (R 4.5.0)
 digest         0.6.37     2024-08-19 [2] CRAN (R 4.5.0)
 dplyr          1.1.4      2023-11-17 [2] CRAN (R 4.5.0)
 evaluate       1.0.4      2025-06-18 [2] CRAN (R 4.5.0)
 farver         2.1.2      2024-05-13 [2] CRAN (R 4.5.0)
 fastmap        1.2.0      2024-05-15 [2] CRAN (R 4.5.0)
 generics       0.1.4      2025-05-09 [2] CRAN (R 4.5.0)
 ggplot2        3.5.2      2025-04-09 [2] CRAN (R 4.5.0)
 glue           1.8.0      2024-09-30 [2] CRAN (R 4.5.0)
 gtable         0.3.6      2024-10-25 [2] CRAN (R 4.5.0)
 here         * 1.0.1      2020-12-13 [2] CRAN (R 4.5.0)
 htmltools      0.5.8.1    2024-04-04 [2] CRAN (R 4.5.0)
 htmlwidgets    1.6.4      2023-12-06 [2] CRAN (R 4.5.0)
 httr           1.4.7      2023-08-15 [2] CRAN (R 4.5.0)
 jsonlite       2.0.0      2025-03-27 [2] CRAN (R 4.5.0)
 knitr          1.50       2025-03-16 [2] CRAN (R 4.5.0)
 lattice        0.22-7     2025-04-02 [2] CRAN (R 4.5.0)
 lifecycle      1.0.4      2023-11-07 [2] CRAN (R 4.5.0)
 lme4         * 1.1-37     2025-03-26 [2] CRAN (R 4.5.0)
 lmerTest     * 3.1-3      2020-10-23 [2] CRAN (R 4.5.0)
 magrittr       2.0.3      2022-03-30 [2] CRAN (R 4.5.0)
 MASS           7.3-65     2025-02-28 [2] CRAN (R 4.5.0)
 Matrix       * 1.7-3      2025-03-11 [2] CRAN (R 4.5.0)
 minqa          1.2.8      2024-08-17 [2] CRAN (R 4.5.0)
 nlme           3.1-168    2025-03-31 [2] CRAN (R 4.5.0)
 nloptr         2.2.1      2025-03-17 [2] CRAN (R 4.5.0)
 numDeriv       2016.8-1.1 2019-06-06 [2] CRAN (R 4.5.0)
 pacman       * 0.5.1      2019-03-11 [2] CRAN (R 4.5.0)
 pander       * 0.6.6      2025-03-01 [2] CRAN (R 4.5.0)
 partR2       * 0.9.2      2024-03-04 [2] CRAN (R 4.5.0)
 pillar         1.11.0     2025-07-04 [2] CRAN (R 4.5.0)
 pkgconfig      2.0.3      2019-09-22 [2] CRAN (R 4.5.0)
 png            0.1-8      2022-11-29 [2] CRAN (R 4.5.0)
 R6             2.6.1      2025-02-15 [2] CRAN (R 4.5.0)
 rbibutils      2.3        2024-10-04 [2] CRAN (R 4.5.0)
 RColorBrewer   1.1-3      2022-04-03 [2] CRAN (R 4.5.0)
 Rcpp           1.1.0      2025-07-02 [1] CRAN (R 4.5.0)
 Rdpack         2.6.4      2025-04-09 [2] CRAN (R 4.5.0)
 reformulas     0.4.1      2025-04-30 [2] CRAN (R 4.5.0)
 rlang          1.1.6      2025-04-11 [2] CRAN (R 4.5.0)
 rmarkdown      2.29       2024-11-04 [2] CRAN (R 4.5.0)
 rprojroot      2.1.0      2025-07-12 [2] CRAN (R 4.5.0)
 rptR         * 0.9.23     2025-04-29 [2] CRAN (R 4.5.0)
 rstudioapi     0.17.1     2024-10-22 [2] CRAN (R 4.5.0)
 scales         1.4.0      2025-04-24 [2] CRAN (R 4.5.0)
 sessioninfo  * 1.2.3      2025-02-05 [2] CRAN (R 4.5.0)
 tibble         3.3.0      2025-06-08 [2] CRAN (R 4.5.0)
 tidyselect     1.2.1      2024-03-11 [2] CRAN (R 4.5.0)
 vctrs          0.6.5      2023-12-01 [2] CRAN (R 4.5.0)
 withr          3.0.2      2024-10-28 [2] CRAN (R 4.5.0)
 xfun           0.52       2025-04-02 [2] CRAN (R 4.5.0)
 yaml           2.3.10     2024-07-26 [2] CRAN (R 4.5.0)

 [1] /Users/z3437171/Library/R/arm64/4.5/library
 [2] /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library
 * ── Packages attached to the search path.

──────────────────────────────────────────────────────────────────────────────