When variance carries signal • drmTMB

This tutorial starts with a biological question: do two habitats differ only in mean growth, or does one habitat also make growth less predictable? In a Gaussian location-scale model, the location part models the expected response mu, while the scale part models the residual standard deviation sigma. That residual scale is not just a nuisance parameter; it can be the scientific answer when the question is about individual variability, predictability, or heterogeneous heterogeneity.

If you are fitting your first model, start with the worked growth example, then return to the syntax overview below when you want to adapt the formula.

The first implemented drmTMB model class is Gaussian location-scale regression with fixed effects, optional random effects in the location formula, residual-scale random intercepts and independent random slopes in the sigma formula, and random-effect scale formulas such as sd(id) ~ x_group. The same pattern will be used throughout the documentation: write the model symbolically first, then show the matching R syntax, fit the model, and interpret the fitted output.

Current syntax:

drmTMB(
  bf(y ~ x1, sigma ~ x2),
  family = gaussian(),
  data = dat
)

Random-intercept location syntax:

drmTMB(
  bf(y ~ x1 + (1 | id), sigma ~ x1),
  family = gaussian(),
  data = dat
)

Simple random-slope location syntax:

drmTMB(
  bf(y ~ x1 + (0 + x1 | id), sigma ~ x1),
  family = gaussian(),
  data = dat
)

Residual-scale random-effect syntax:

drmTMB(
  bf(y ~ x1 + (1 | id), sigma ~ x2 + (1 | id) + (0 + w | id)),
  family = gaussian(),
  data = dat
)

The companion article “Which scale are you modelling?” gives a side-by-side guide to sigma ~ x, sigma ~ x + (1 | id), sigma ~ x + (0 + w | id), sd(id) ~ x_group, random-slope correlations, and bivariate residual rho12.

Model equations and matching R syntax

The basic discipline is simple: each estimated parameter gets its own linear predictor. For Gaussian location-scale regression those parameters are the mean, mu, and the residual standard deviation, sigma.

Throughout this article, Normal(a, b) uses variance as the second argument. Thus Normal(mu_i, sigma_i^2) means a Gaussian distribution with mean mu_i and standard deviation sigma_i.

For the fixed-effect Gaussian location-scale model, the symbolic model is:

$\begin{aligned} y_i \mid \mu_i, \sigma_i &\sim \operatorname{Normal}(\mu_i, \sigma_i^2),\\ \mu_i &= \beta_0 + \beta_1 x_{1i},\\ \log(\sigma_i) &= \gamma_0 + \gamma_1 x_{2i}. \end{aligned}$

The matching drmTMB syntax is:

drmTMB(
  bf(y ~ x1, sigma ~ x2),
  family = gaussian(),
  data = dat
)

The location formula y ~ x1 defines the design matrix for the beta coefficients. The scale formula sigma ~ x2 defines the design matrix for the gamma coefficients. The log link makes sigma_i positive:

$\sigma_i = \exp(\gamma_0 + \gamma_1 x_{2i}).$

The same model can be written more compactly as:

$\begin{aligned} \eta_{\mu i} &= X_{\mu}[i, ]\beta_{\mu},\\ \eta_{\sigma i} &= X_{\sigma}[i, ]\beta_{\sigma},\\ \mu_i &= \eta_{\mu i},\\ \sigma_i &= \exp(\eta_{\sigma i}). \end{aligned}$

The mapping is:

Symbol	Meaning	R syntax source
`y_i`	observed response	left-hand side of `y ~ x1`
`X_mu`	design matrix for the mean	right-hand side of `y ~ x1`
`beta_mu`	mean coefficients	`coef(fit, "mu")`
`mu_i`	fitted mean	`fitted(fit)` or `predict(fit, dpar = "mu")`
`X_sigma`	design matrix for the residual scale	right-hand side of `sigma ~ x2`
`beta_sigma`	log-scale coefficients	`coef(fit, "sigma")`
`sigma_i`	residual standard deviation	`sigma(fit)` or `predict(fit, dpar = "sigma")`

With a random intercept in the location part, the model becomes:

$\begin{aligned} y_{ij} \mid \mu_{ij}, \sigma_{ij} &\sim \operatorname{Normal}(\mu_{ij}, \sigma_{ij}^2),\\ \mu_{ij} &= \beta_0 + \beta_1 x_{1ij} + b_j,\\ \log(\sigma_{ij}) &= \gamma_0 + \gamma_1 x_{2ij},\\ b_j &\sim \operatorname{Normal}(0, sd_{\mu,id}^2). \end{aligned}$

and the matching syntax is:

drmTMB(
  bf(y ~ x1 + (1 | id), sigma ~ x2),
  family = gaussian(),
  data = dat
)

Here sigma_ij is the residual standard deviation. The random-intercept scale sd_mu_id is a separate group-level standard deviation. In implemented double-hierarchical scale models, sd(id) ~ x_group exposes this group-level scale component by name; it should not be confused with residual sigma.

Residual-scale random intercepts enter a different part of the model:

$\begin{aligned} y_{ij} \mid \mu_{ij}, \sigma_{ij} &\sim \operatorname{Normal}(\mu_{ij}, \sigma_{ij}^2),\\ \mu_{ij} &= X_{\mu}[ij, ]\beta_{\mu} + b_j,\\ \log(\sigma_{ij}) &= X_{\sigma}[ij, ]\beta_{\sigma} + a_j,\\ b_j &\sim \operatorname{Normal}(0, sd_{\mu,id}^2),\\ a_j &\sim \operatorname{Normal}(0, sd_{\sigma,id}^2). \end{aligned}$

The matching syntax is:

drmTMB(
  bf(y ~ x1 + (1 | id), sigma ~ x2 + (1 | id) + (0 + w | id)),
  family = gaussian(),
  data = dat
)

Here a_j is residual-scale heterogeneity, or group-to-group variation in within-group variability. It is still not the same quantity as sd_mu_id, the standard deviation of the mu random intercepts.

A double-hierarchical scale formula targets the random-effect standard deviation itself. For example:

$\begin{aligned} y_{ij} \mid \mu_{ij}, \sigma_{ij}, b_j &\sim \operatorname{Normal}(\mu_{ij}, \sigma_{ij}^2),\\ \mu_{ij} &= \beta_0 + \beta_1 x_{1ij} + b_j,\\ \log(\sigma_{ij}) &= \gamma_0 + \gamma_1 x_{2ij},\\ b_j &= sd_{\mu,id,j} u_j,\\ u_j &\sim \operatorname{Normal}(0, 1),\\ \log(sd_{\mu,id,j}) &= \alpha_0 + \alpha_1 x_{\text{group},j}. \end{aligned}$

The matching syntax is:

drmTMB(
  bf(
    y ~ x1 + (1 | id),
    sigma ~ x2,
    sd(id) ~ x_group
  ),
  family = gaussian(),
  data = dat
)

This model has one residual scale, sigma_ij, and one group-level scale, sd_mu_id,j. They answer different questions. The sigma formula asks whether observations are more or less variable around their fitted mean. The sd(id) formula asks whether among-group differences in the mean are larger for some types of groups.

With two additive random intercepts, the equation has two group-level scale components:

$\begin{aligned} y_i \mid \mu_i, \sigma_i &\sim \operatorname{Normal}(\mu_i, \sigma_i^2),\\ \mu_i &= X_{\mu}[i, ]\beta_{\mu} + b_{\text{site}[i]} + b_{\text{observer}[i]},\\ \log(\sigma_i) &= X_{\sigma}[i, ]\beta_{\sigma},\\ b_{\text{site},k} &\sim \operatorname{Normal}(0, sd_{\mu,site}^2),\\ b_{\text{observer},l} &\sim \operatorname{Normal}(0, sd_{\mu,observer}^2). \end{aligned}$

and the matching syntax is:

drmTMB(
  bf(y ~ x1 + (1 | site) + (1 | observer), sigma ~ x2),
  family = gaussian(),
  data = dat
)

With a simple random slope, the random effect multiplies the predictor value:

$\begin{aligned} y_{ij} \mid \mu_{ij}, \sigma_{ij} &\sim \operatorname{Normal}(\mu_{ij}, \sigma_{ij}^2),\\ \mu_{ij} &= \beta_0 + \beta_1 x_{1ij} + x_{1ij} b_j,\\ \log(\sigma_{ij}) &= \gamma_0 + \gamma_1 x_{2ij},\\ b_j &\sim \operatorname{Normal}(0, sd_{\mu,x,id}^2). \end{aligned}$

and the matching syntax is:

drmTMB(
  bf(y ~ x1 + (0 + x1 | id), sigma ~ x2),
  family = gaussian(),
  data = dat
)

The current independent random-intercept and random-slope syntax writes the two group-level components separately:

drmTMB(
  bf(y ~ x1 + (1 | id) + (0 + x1 | id), sigma ~ x2),
  family = gaussian(),
  data = dat
)

This is intentionally not the same as a correlated random intercept/slope block. The ordinary correlated block is also implemented:

drmTMB(
  bf(y ~ x1 + (1 + x1 | id), sigma ~ x2),
  family = gaussian(),
  data = dat
)

The intercept-slope correlation from (1 + x1 | id) is a group-level correlation reported under corpars$mu. A labelled version is also implemented for univariate Gaussian mu:

drmTMB(
  bf(y ~ x1 + (1 + x1 | p | id), sigma ~ x2),
  family = gaussian(),
  data = dat
)

Here p is retained as a group-level covariance-block label. Matching intercept-only mu and sigma terms can now use the same label for the first mean-scale covariance block; larger cross-formula and bivariate blocks remain future work.

This is where ordinary language can become slippery: the model may contain several scale quantities, but only one of them is the residual sigma. The others should be named by what they scale:

residual SD: sigma_i, from sigma ~ x2;
site random-intercept SD in mu: sd_mu_site, from the site random intercept term in the location formula;
observer random-intercept SD in mu: sd_mu_observer, from the observer random intercept term in the location formula;
ID random-slope SD in mu: sd_mu_x_id, from a random-slope term such as (0 + x1 | id);
ID random intercept-slope correlation in mu: corpars$mu, from a correlated block such as (1 + x1 | id);
random-effect scale model: sd(site)_k, from syntax such as sd(site) ~ x_group when the target is an unlabelled Gaussian mu random intercept.

When the model includes a correlated block such as (1 + x1 | p | id), corpairs(fit, level = "group") returns the ordinary group-level correlation row. For this one-slope block the row has class mean-slope: it asks whether groups with higher baseline mu also tend to have steeper or shallower mean-model slopes. It is not rho12, because rho12 belongs only to bivariate residual response-response coupling.

For a biological reaction-norm question, let x1 be centred temperature and id be population. The fixed slope beta_1 is the average temperature effect. The random-intercept SD asks how much populations differ at the centred temperature. The random-slope SD asks how much populations differ in thermal plasticity. The intercept-slope correlation asks whether populations with high baseline response tend to have steeper or shallower thermal slopes:

$\begin{aligned} \mu_{ij} &= \beta_0 + \beta_1 temperature_{ij} + b_{0j} + temperature_{ij} b_{1j},\\ \begin{bmatrix}b_{0j}\\ b_{1j}\end{bmatrix} &\sim \operatorname{Normal}\left( \begin{bmatrix}0\\0\end{bmatrix}, \Sigma_{\text{population}} \right). \end{aligned}$

The matching fitted ordinary syntax is:

drmTMB(
  bf(y ~ temperature + (1 + temperature | p | population), sigma ~ habitat),
  family = gaussian(),
  data = dat
)

Read the fitted object as sdpars$mu for the baseline and slope SDs, corpairs(fit, class = "mean-slope") for the group-level intercept-slope correlation, and profile_targets(fit) for the direct interval target names.

The location formula models the conditional mean, mu, on the identity scale. The scale formula models the residual standard deviation, sigma, on the log scale:

$\begin{aligned} y_i \mid \mu_i, \sigma_i &\sim \operatorname{Normal}(\mu_i, \sigma_i^2),\\ \mu_i &= X_{\mu}\beta_{\mu} + b_{group[i]},\\ \log(\sigma_i) &= X_{\sigma}\beta_{\sigma},\\ b_{\text{group}} &= sd_{\text{group}} u_{\text{group}},\\ u_{\text{group}} &\sim \operatorname{Normal}(0, 1). \end{aligned}$

This means coefficients returned by coef(fit, "sigma") are log-standard deviation effects. Exponentiating a sigma coefficient gives the multiplicative change in residual standard deviation for a one-unit change in that predictor. For example, a sigma coefficient of 0.3 means exp(0.3) = 1.35, or about a 35% higher residual standard deviation.

For biological interpretation, keep the slope-like quantities separate and translate each one to the scale on which it answers the biological question:

Model component	Example syntax	Response-scale reading	Biological sentence
fixed mean slope	`growth ~ temperature`	`beta_temperature` is an additive change in expected growth per temperature unit	average growth changes with temperature
fixed residual-scale slope	`sigma ~ temperature`	`exp(gamma_temperature)` is the residual-SD ratio per temperature unit; `exp(2 * gamma_temperature)` is the residual-variance ratio	growth becomes more or less predictable along the temperature gradient
mean random-slope SD	`(0 + temperature \| population)` or `(1 + temperature \| population)`	`sd_mu_temperature_population` is the SD of population-specific temperature slopes, in growth-per-temperature units	populations differ in thermal reaction norms
residual-scale random-slope SD	`sigma ~ (0 + temperature \| population)`	the SD is on the log-residual-SD slope scale; `exp(sd)` and `exp(-sd)` give a rough one-SD spread in residual-SD ratios per temperature unit	populations differ in how predictability changes with temperature
random-effect scale slope	`sd(population) ~ habitat`	`exp(alpha_habitat)` is the ratio of among-population SDs; `exp(2 * alpha_habitat)` is the corresponding among-population variance ratio	among-population differences in expected growth are larger in one habitat

The first two rows are fixed-effect slopes. The next two rows are variance components for group-specific slopes. The last row is a direct model for a group-level SD. A single ecological phrase such as “temperature-dependent variability” can point to any of these, so name the model component before stating the biological conclusion. Current sd(population) ~ habitat syntax targets an unlabelled random intercept; coefficient-specific syntax such as sd(population, dpar = "mu", coef = "temperature") ~ habitat is reserved until random-slope SD regression has likelihood code and simulation tests.

Name the biological response before the symbol

When adapting the syntax, replace placeholders such as y, x1, and x2 with trait names before writing the interpretation. A useful tutorial sentence defines the measured response, the covariates, the distributional parameters, and the biological meaning of each slope.

For example, the parrot example in the phylogenetic location-scale paper asks whether forest habitat changes the average beak length of parrot species, the among-species heterogeneity in beak length, or both. Written in the non-phylogenetic Gaussian syntax used in this tutorial, the teaching version is:

$\begin{aligned} \text{beak\_length}_i &\sim \operatorname{Normal}(\mu_i, \sigma_i^2),\\ \mu_i &= \beta_0^{(\mu)} + \beta_{\text{mass}}^{(\mu)}\text{body\_mass}_i + \beta_{\text{forest}}^{(\mu)}\text{forest}_i,\\ \log(\sigma_i) &= \beta_0^{(\sigma)} + \beta_{\text{mass}}^{(\sigma)}\text{body\_mass}_i + \beta_{\text{forest}}^{(\sigma)}\text{forest}_i . \end{aligned}$

drmTMB(
  bf(beak_length ~ body_mass + forest, sigma ~ body_mass + forest),
  family = gaussian(),
  data = parrots
)

Quantity	Definition in the parrot example	Biological reading
`beak_length_i`	measured, centred, or log-centred beak length for species `i`	the focal morphological trait
`body_mass_i`	centred log body mass for species `i`	a size covariate, so the beak-length comparison is not just body-size allometry
`forest_i`	habitat indicator for species `i`, commonly 1 for forest-living and 0 for non-forest species	the ecological contrast
`mu_i`	expected beak length for species `i` after conditioning on body mass and habitat	the location, or mean trait value
`sigma_i`	residual standard deviation of beak length for species `i` after conditioning on body mass and habitat	the scale, or remaining among-species heterogeneity
`beta_forest^(mu)`	forest versus non-forest contrast in `mu`	whether forest species have longer or shorter expected beaks
`beta_forest^(sigma)`	forest versus non-forest contrast in `log(sigma)`	whether forest species are more or less heterogeneous in beak length

Report the scale coefficient on the scale that matches the claim: exp(beta_forest^(sigma)) is the residual-SD ratio for forest versus non-forest species, while exp(2 * beta_forest^(sigma)) is the residual-variance ratio. A mean effect and a scale effect are different biological claims. The first says forest habitat shifts average beak length; the second says forest habitat changes beak-length predictability or evolvability after the mean model has been accounted for.

The same naming rule works for behavioural or life-history examples. If the response is aggressiveness and the predictor is life_history_pace, then beta_life_history_pace^(mu) is the expected change in mean aggressiveness, whereas beta_life_history_pace^(sigma) is the change in residual predictability of aggressiveness. If the location formula also contains (0 + life_history_pace | population), the random-slope SD is a third quantity: it measures how much populations differ in the life-history slope of aggressiveness.

For effect-size synthesis, use the same habit of defining every symbol, but keep the teaching path in the separate meta-analysis article. There the known sampling variance or covariance is V, and the fitted sigma model describes extra heterogeneity after V has been included.

Worked example: growth mean and predictability

Suppose an ecologist measures juvenile growth in two habitats and records the temperature at each observation. The location question is whether mean growth differs between habitats and changes with temperature. The scale question is whether residual growth variability differs between habitats after accounting for those mean effects.

For this example, the fitted model is:

$\begin{aligned} \text{growth}_i &\sim \operatorname{Normal}(\mu_i, \sigma_i^2), \\ \mu_i &= \beta_0 + \beta_1 \mathbb{1}(\text{habitat}_i = \text{grassland}) + \beta_2 \text{temperature}_i, \\ \log(\sigma_i) &= \gamma_0 + \gamma_1 \mathbb{1}(\text{habitat}_i = \text{grassland}). \end{aligned}$

The matching drmTMB syntax is:

drmTMB(
  drm_formula(growth ~ habitat + temperature, sigma ~ habitat),
  family = gaussian(),
  data = dat
)

The code below simulates one dataset with the same structure. Simulated data keep the vignette small and reproducible; a real analysis would replace this block with field or laboratory measurements.

set.seed(42)
n <- 240
dat <- data.frame(
  habitat = factor(rep(c("forest", "grassland"), each = n / 2)),
  temperature = rnorm(n)
)

mu <- 8 + 1.2 * (dat$habitat == "grassland") + 0.6 * dat$temperature
sigma_true <- exp(log(0.6) + 0.55 * (dat$habitat == "grassland"))
dat$growth <- rnorm(n, mean = mu, sd = sigma_true)

Fit the location-scale model:

fit_growth <- drmTMB(
  drm_formula(growth ~ habitat + temperature, sigma ~ habitat),
  family = gaussian(),
  data = dat
)

Before interpreting coefficients, check the fitted object:

check_drm(fit_growth)
#> <drm_check: 10 checks>
#> ok: 10; notes: 0; warnings: 0; errors: 0
#>                      check status
#>      optimizer_convergence     ok
#>           optimizer_budget     ok
#>           finite_objective     ok
#>             fixed_gradient     ok
#>            sdreport_status     ok
#>  hessian_positive_definite     ok
#>     standard_errors_finite     ok
#>               dropped_rows     ok
#>             positive_scale     ok
#>   fixed_effect_design_size     ok
#>                                                                                   value
#>                                                                                       0
#>                                                 iterations=22; function=32; gradient=23
#>                                                                                   279.3
#>                                                     max=0.0002960; component=beta_mu[2]
#>                                                                                      ok
#>                                                                                    TRUE
#>                                                                  range=[0.04551,0.1102]
#>                                                                     nobs=240; dropped=0
#>                                                                              min=0.5633
#>  total_mb=0.04107; max_cols=3; largest=mu; largest_class=matrix; largest_density=0.8333
#>                                                                              message
#>                                                        nlminb convergence code is 0.
#>  Optimizer evaluation counts recorded; no eval.max or iter.max control was supplied.
#>                                             Objective and log-likelihood are finite.
#>        Maximum absolute fixed gradient is <= 0.001; largest component is beta_mu[2].
#>                                              TMB::sdreport() completed successfully.
#>                                        sdreport reports a positive-definite Hessian.
#>                                         All fixed-effect standard errors are finite.
#>                   No rows were dropped by model-frame or known-covariance filtering.
#>                                     All fitted scale values are finite and positive.
#>                          Dense fixed-effect design matrices are modest for this fit.

The interval target inventory is part of the same interpretation gate. For this fixed-effect example, every coefficient is a direct profile target:

profile_targets(fit_growth)[
  ,
  c("parm", "estimate", "profile_ready", "profile_note")
]
#>                           parm   estimate profile_ready profile_note
#> 1         fixef:mu:(Intercept)  7.9948257          TRUE        ready
#> 2    fixef:mu:habitatgrassland  1.2068544          TRUE        ready
#> 3         fixef:mu:temperature  0.5745450          TRUE        ready
#> 4      fixef:sigma:(Intercept) -0.5740181          TRUE        ready
#> 5 fixef:sigma:habitatgrassland  0.6376415          TRUE        ready

Now print the coefficient table:

summary(fit_growth)
#> <summary.drmTMB>
#>                          estimate  std_error
#> mu:(Intercept)          7.9948257 0.05143586
#> mu:habitatgrassland     1.2068544 0.11021152
#> mu:temperature          0.5745450 0.04550931
#> sigma:(Intercept)      -0.5740181 0.06466077
#> sigma:habitatgrassland  0.6376415 0.09160079
#> Distributional, random-effect, scale, and correlation parameters:
#>                         component  dpar         term  estimate   minimum
#> fitted:sigma distributional-scale sigma fitted range 0.8144743 0.5632576
#>               maximum    scale
#> fitted:sigma 1.065691 response
#> logLik: -279.3
#> convergence: 0

How to read this output:

Rows beginning with mu: are mean-growth effects on the response scale. For example, mu:temperature is the expected change in mean growth for a one-unit increase in temperature.
Rows beginning with sigma: are log-residual-standard-deviation effects. For example, sigma:habitatgrassland is not an additive change in growth; it is a log-scale change in residual variability.
Exponentiating a sigma coefficient gives the multiplicative change in residual standard deviation.

sigma_habitat <- coef(fit_growth, "sigma")["habitatgrassland"]
data.frame(
  coefficient = sigma_habitat,
  residual_sd_ratio = exp(sigma_habitat),
  residual_variance_ratio = exp(2 * sigma_habitat)
)
#>                  coefficient residual_sd_ratio residual_variance_ratio
#> habitatgrassland   0.6376415          1.892013                3.579714

In this fitted example, exp(sigma:habitatgrassland) is the estimated ratio of grassland residual SD to forest residual SD. A value near 2 means that grassland has about twice the residual SD of forest after accounting for mean habitat and temperature effects. Because residual variance is SD squared, the same fitted coefficient implies about four times the residual variance.

It is often clearer to report the fitted residual SDs and variances directly:

newdat <- data.frame(
  habitat = factor(c("forest", "grassland"), levels = levels(dat$habitat)),
  temperature = 0
)

growth_report <- data.frame(
  habitat = newdat$habitat,
  fitted_mean_growth = predict(fit_growth, newdata = newdat, dpar = "mu"),
  fitted_residual_sd = predict(fit_growth, newdata = newdat, dpar = "sigma")
)
growth_report$fitted_residual_variance <- growth_report$fitted_residual_sd^2
growth_report
#>     habitat fitted_mean_growth fitted_residual_sd fitted_residual_variance
#> 1    forest           7.994826          0.5632576                0.3172592
#> 2 grassland           9.201680          1.0656910                1.1356972

This table maps the model back to the scientific question. The mean column summarises the location model. The residual SD column summarises the scale model in the units of growth. The residual variance column is the variance-facing version of the same fitted Gaussian scale model.

The compact translation table below is the reporting layer Pat should be able to read without returning to the equations:

growth_translation <- data.frame(
  model_piece = c(
    "fixed mean slope",
    "fixed residual-SD contrast",
    "fixed residual-variance contrast"
  ),
  fitted_term = c(
    "mu:temperature",
    "exp(sigma:habitatgrassland)",
    "exp(2 * sigma:habitatgrassland)"
  ),
  response_scale_value = c(
    unname(coef(fit_growth, "mu")["temperature"]),
    unname(exp(sigma_habitat)),
    unname(exp(2 * sigma_habitat))
  ),
  interpretation = c(
    "mean growth change per one-unit temperature increase",
    "grassland residual SD divided by forest residual SD",
    "grassland residual variance divided by forest residual variance"
  )
)
growth_translation$response_scale_value <-
  round(growth_translation$response_scale_value, 3)
growth_translation
#>                        model_piece                     fitted_term
#> 1                 fixed mean slope                  mu:temperature
#> 2       fixed residual-SD contrast     exp(sigma:habitatgrassland)
#> 3 fixed residual-variance contrast exp(2 * sigma:habitatgrassland)
#>   response_scale_value
#> 1                0.575
#> 2                1.892
#> 3                3.580
#>                                                    interpretation
#> 1            mean growth change per one-unit temperature increase
#> 2             grassland residual SD divided by forest residual SD
#> 3 grassland residual variance divided by forest residual variance

A report can now say three concrete things. Mean growth increases by the fitted mu:temperature slope for each one-unit temperature increase. Grassland has the fitted residual-SD ratio in the second row after accounting for mean habitat and temperature effects. Because Gaussian variance is sigma^2, the residual variance contrast is the third row. If that third row is larger than one, individual growth is less predictable in grassland.

The same reporting discipline carries over to hierarchical versions of the question:

If the biological question is…	Fit this kind of term	Report this quantity	Check before reporting
Do populations differ in thermal plasticity?	`(0 + temperature \| population)` or `(1 + temperature \| population)` in the `mu` formula	the random-slope SD, in growth-per-temperature units	`check_drm(fit)` and the matching `sd:mu:...temperature...` row in `profile_targets(fit)`
Does predictability change with temperature?	`sigma ~ temperature`	`exp(gamma_temperature)` as the residual-SD ratio, and `exp(2 * gamma_temperature)` when the paper talks about variance	`check_drm(fit)` and the `fixef:sigma:temperature` row in `profile_targets(fit)`
Do habitats differ in among-population variation?	`(1 \| population)` plus `sd(population) ~ habitat`	`exp(alpha_habitat)` as the ratio of among-population SDs	`check_drm(fit)` and the `fixef:sd(population):...` row in `profile_targets(fit)`

These three rows answer different biological questions. sigma ~ temperature models residual variation among observations. (0 + temperature | population) models population-to-population differences in the mean slope. sd(population) ~ habitat models the size of among-population mean differences.

Curved responses and interactions

Ecological responses are often curved rather than straight-line. Growth may increase from cold to moderate temperatures and then decline at high temperatures. A second-order term is usually enough for this kind of question; third-order polynomials are harder to explain and should be used only when the question and data justify them.

The symbolic model below asks whether mean growth has a quadratic temperature response and whether the temperature slope differs between habitats. It also asks whether residual variability changes along the same temperature gradient:

$\begin{aligned} \text{growth}_i &\sim \operatorname{Normal}(\mu_i, \sigma_i^2), \\ \mu_i &= \beta_0 + \beta_1 \text{temperature}_i + \beta_2 \text{temperature}_i^2 + \beta_3 \mathbb{1}(\text{habitat}_i = \text{grassland}) + \beta_4 \text{temperature}_i \mathbb{1}(\text{habitat}_i = \text{grassland}), \\ \log(\sigma_i) &= \gamma_0 + \gamma_1 \text{temperature}_i^2. \end{aligned}$

The matching R formula uses ordinary R formula syntax:

drmTMB(
  drm_formula(
    growth ~ habitat * temperature + I(temperature^2),
    sigma ~ I(temperature^2)
  ),
  family = gaussian(),
  data = dat_curve
)

The term habitat * temperature expands to the two main effects plus their interaction. The term I(temperature^2) tells R to square temperature before building the design matrix. The alternative poly(temperature, 2) is also supported and can be numerically useful, but its coefficients are basis coefficients; for applied interpretation, predicted means and residual SDs are usually clearer than the raw polynomial coefficients.

set.seed(43)
n_curve <- 260
dat_curve <- data.frame(
  habitat = factor(rep(c("forest", "grassland"), each = n_curve / 2)),
  temperature = runif(n_curve, -2, 2)
)

mu_curve <- 8 +
  0.6 * dat_curve$temperature -
  0.45 * dat_curve$temperature^2 +
  0.8 * (dat_curve$habitat == "grassland") +
  0.35 * dat_curve$temperature * (dat_curve$habitat == "grassland")
sigma_curve <- exp(log(0.55) + 0.18 * dat_curve$temperature^2)
dat_curve$growth <- rnorm(n_curve, mean = mu_curve, sd = sigma_curve)

fit_curve <- drmTMB(
  drm_formula(
    growth ~ habitat * temperature + I(temperature^2),
    sigma ~ I(temperature^2)
  ),
  family = gaussian(),
  data = dat_curve
)

summary(fit_curve)
#> <summary.drmTMB>
#>                                   estimate  std_error
#> mu:(Intercept)                   7.9939526 0.06466906
#> mu:habitatgrassland              0.7803419 0.08215938
#> mu:temperature                   0.7613439 0.06610385
#> mu:I(temperature^2)             -0.4228598 0.04308159
#> mu:habitatgrassland:temperature  0.1347556 0.09137770
#> sigma:(Intercept)               -0.6854670 0.06644116
#> sigma:I(temperature^2)           0.2402096 0.03532779
#> Distributional, random-effect, scale, and correlation parameters:
#>                         component  dpar         term  estimate   minimum
#> fitted:sigma distributional-scale sigma fitted range 0.7413125 0.5038625
#>               maximum    scale
#> fitted:sigma 1.299951 response
#> logLik: -278.9
#> convergence: 0

For a curved model, the most useful first output is often a prediction table. Here we compare cold, moderate, and warm temperatures in the two habitats:

curve_grid <- expand.grid(
  habitat = levels(dat_curve$habitat),
  temperature = c(-1.5, 0, 1.5)
)

data.frame(
  curve_grid,
  fitted_mean_growth = predict(fit_curve, newdata = curve_grid, dpar = "mu"),
  fitted_residual_sd = predict(fit_curve, newdata = curve_grid, dpar = "sigma")
)
#>     habitat temperature fitted_mean_growth fitted_residual_sd
#> 1    forest        -1.5           5.900502          0.8650263
#> 2 grassland        -1.5           6.478711          0.8650263
#> 3    forest         0.0           7.993953          0.5038549
#> 4 grassland         0.0           8.774295          0.5038549
#> 5    forest         1.5           8.184534          0.8650263
#> 6 grassland         1.5           9.167009          0.8650263

This table is easier to explain than the individual polynomial coefficients. The fitted mean column describes the location curve: expected growth is highest near the middle of the temperature range in this simulated example. The fitted residual SD column describes the scale curve: growth is less predictable at temperature extremes when the fitted residual SD is larger.

For several predictors, base R interaction expansions work in the same way. For example, (temperature + moisture + canopy)^2 expands to all three main effects and all pairwise interactions. That formula is often easier to read than writing every interaction by hand, but it should still be tied to a clear biological question.

If the sigma formula is omitted, drmTMB fits an intercept-only residual standard deviation:

fit0 <- drmTMB(
  drm_formula(growth ~ habitat + temperature),
  family = gaussian(),
  data = dat
)

coef(fit0, "sigma")
#> (Intercept) 
#>  -0.1606356

That simpler model can still estimate habitat and temperature effects on mean growth, but it assumes one residual SD for all observations.

Meta-analysis as Gaussian regression

Meta-analysis is the same Gaussian regression idea with known sampling covariance added to the likelihood. For diagonal known sampling variance:

$\begin{aligned} y_i \mid \mu_i, \sigma_i, v_i &\sim \operatorname{Normal}(\mu_i, v_i + \sigma_i^2),\\ \mu_i &= X_{\mu}[i, ]\beta_{\mu},\\ \sigma_i &= \exp(X_{\sigma}[i, ]\beta_{\sigma}). \end{aligned}$

with matching syntax:

drmTMB(
  bf(yi ~ x1 + meta_known_V(V = vi), sigma ~ x1),
  family = gaussian(),
  data = dat
)

The canonical marker is meta_known_V(V = V); in concrete code, V can be a vector such as vi. Here vi is known sampling variance. The estimated sigma_i is the additional heterogeneity standard deviation after known sampling error has been included. In meta-analysis prose this quantity is often called tau, but the drmTMB API keeps the name sigma because it is still the residual scale parameter of the Gaussian regression.

When V is a full known covariance matrix, the same model is:

$y \mid \mu, \sigma, V \sim \operatorname{MVN}\left(\mu, V + \operatorname{diag}(\sigma_i^2)\right).$

Current implementation caveats

The current fitter supports family = gaussian() with fixed effects for mu and sigma, plus random intercepts, independent numeric random slopes, and labelled or unlabelled ordinary correlated intercept-slope blocks in mu.
Residual-scale random intercepts and independent random slopes in sigma, such as sigma ~ x2 + (1 | id) and sigma ~ x2 + (0 + w | id), are implemented for univariate Gaussian models.
The sigma formula is the observation-level residual standard deviation. It is not a random-effect scale component.
Missing rows are dropped when they are incomplete for variables used by either the mu or sigma formula.
Diagonal and dense full known sampling covariance via meta_known_V(V = V) is supported for Gaussian meta-analysis.
Fixed-effect bivariate Gaussian rho12 models are implemented in the bivariate coscale article.
sd(group) random-effect scale formulae are implemented for one or more distinct unlabelled Gaussian mu random intercepts, with group-level predictors that are constant within the named grouping variable.
Correlated residual-scale slope blocks, slope-specific sd() targets, labelled-block sd() targets, cross-formula labelled covariance sharing beyond the first random-intercept slice, phylogenetic slopes, standalone or partial phylogenetic scale terms, spatial sigma, bivariate spatial covariance, mesh/SPDE spatial fields, multiple spatial slopes, and non-Gaussian random effects are planned but not implemented in this Gaussian tutorial.
Intercept-only phylogenetic random effects are implemented in univariate Gaussian mu formulas as phylo(1 | species, tree = tree).
Coordinate-spatial random effects are implemented in univariate Gaussian mu formulas as spatial(1 | site, coords = coords) and one numeric spatial(1 + x | site, coords = coords) slope.

Multiple random-effect scale components use explicit sd() formulas, not extra residual sigma formulas:

drmTMB(
  bf(
    y ~ x1 + (1 | study) + (1 | species),
    sigma ~ x1,
    sd(study) ~ x1,
    sd(species) ~ 1
  ),
  family = gaussian(),
  data = dat
)