Robust continuous responses • drmTMB

Continuous biological and environmental measurements, such as seedling height growth in centimetres, often contain a few large residuals. They may be real rare events, short-term stress responses, field measurement problems, or individuals that experienced unmeasured conditions. A Gaussian location-scale model can still be useful, but a Student-t model asks a slightly different question: are the location and scale patterns stable when the likelihood allows heavier tails?

This article introduces the first implemented robust continuous family in drmTMB: fixed-effect Student-t location-scale-shape regression. Here location means the expected response parameter mu, scale means the positive Student-t core scale sigma, and shape means the tail parameter nu. This one-response model has no coscale term; in drmTMB, coscale means modelling residual correlation, such as bivariate rho12.

Model equation and R syntax

The Student-t model has three distributional parameters:

$\begin{aligned} \text{growth}_i \mid \mu_i, \sigma_i, \nu_i &\sim \operatorname{Student\text{-}t}(\mu_i, \sigma_i, \nu_i),\\ \mu_i &= \beta_0 + \beta_1 \text{dry}_i,\\ \log(\sigma_i) &= \gamma_0 + \gamma_1 \text{dry}_i,\\ \nu_i &= 2 + \exp(\delta_0). \end{aligned}$

The matching drmTMB syntax is:

drmTMB(
  bf(growth ~ drought, sigma ~ drought, nu ~ 1),
  family = student(),
  data = seedlings
)

Here mu is the expected response, sigma is the Student-t scale parameter, and nu is the tail-shape parameter. When nu > 2, the residual standard deviation is sigma * sqrt(nu / (nu - 2)), so sigma should not be read as the exact residual SD unless nu is large. Smaller nu means heavier tails. Large nu means the Student-t likelihood is close to Gaussian. In drmTMB,

$\nu_i = 2 + \exp(\eta_{\nu i}).$

so fitted nu values stay above 2. This keeps the fitted Student-t distribution in the finite-variance region.

A seedling growth example

Suppose seedlings are grown under ambient and dry conditions, and growth is measured as height increase in centimetres. We expect drought to reduce average growth, and we also allow drought to change scale among seedlings.

library(drmTMB)
#> 
#> Attaching package: 'drmTMB'
#> The following object is masked from 'package:base':
#> 
#>     beta

set.seed(101)
n <- 180
seedlings <- data.frame(
  drought = factor(rep(c("ambient", "dry"), each = n / 2))
)
dry <- as.numeric(seedlings$drought == "dry")
mu <- 1.2 - 0.45 * dry
sigma <- exp(-1 + 0.35 * dry)
seedlings$growth <- mu + sigma * rt(n, df = 5)

In the equation above, dry_i is the model-matrix indicator for the dry treatment level. It corresponds to the droughtdry coefficient printed by R.

A Gaussian location-scale model is a useful baseline:

fit_gaussian <- drmTMB(
  bf(growth ~ drought, sigma ~ drought),
  family = gaussian(),
  data = seedlings
)

The robust version keeps the same location and scale formulas, then adds a formula for nu:

fit_student <- drmTMB(
  bf(growth ~ drought, sigma ~ drought, nu ~ 1),
  family = student(),
  data = seedlings
)

The extra line nu ~ 1 says that the tail-shape parameter is estimated as a constant across observations. Predictor-varying fixed-effect nu formulas are also supported. For example, nu ~ drought asks whether the residual tails are heavier in the dry treatment than in the ambient treatment after mu and sigma have been modelled. This first tutorial keeps nu constant so the comparison focuses on location, scale, and residual-tail assumptions.

Check the fitted robust model

Run check_drm() before interpreting the Student-t coefficients:

check_drm(fit_student)
#> <drm_check: 11 checks>
#> ok: 11; notes: 0; warnings: 0; errors: 0
#>                      check status
#>      optimizer_convergence     ok
#>           optimizer_budget     ok
#>           finite_objective     ok
#>             fixed_gradient     ok
#>            sdreport_status     ok
#>  hessian_positive_definite     ok
#>     standard_errors_finite     ok
#>               dropped_rows     ok
#>             positive_scale     ok
#>                 student_nu     ok
#>   fixed_effect_design_size     ok
#>                                                                                   value
#>                                                                                       0
#>                                                 iterations=24; function=38; gradient=25
#>                                                                                   144.6
#>                                                    max=0.00001001; component=beta_mu[2]
#>                                                                                      ok
#>                                                                                    TRUE
#>                                                                  range=[0.04836,0.9790]
#>                                                                     nobs=180; dropped=0
#>                                                                              min=0.4116
#>                                                                     range=[9.615,9.615]
#>  total_mb=0.04302; max_cols=2; largest=mu; largest_class=matrix; largest_density=0.7500
#>                                                                              message
#>                                                        nlminb convergence code is 0.
#>  Optimizer evaluation counts recorded; no eval.max or iter.max control was supplied.
#>                                             Objective and log-likelihood are finite.
#>        Maximum absolute fixed gradient is <= 0.001; largest component is beta_mu[2].
#>                                              TMB::sdreport() completed successfully.
#>                                        sdreport reports a positive-definite Hessian.
#>                                         All fixed-effect standard errors are finite.
#>                   No rows were dropped by model-frame or known-covariance filtering.
#>                                     All fitted scale values are finite and positive.
#>               All fitted Student-t nu values are finite and above the boundary at 2.
#>                          Dense fixed-effect design matrices are modest for this fit.

The student_nu row inspects fitted response-scale nu values. A near-boundary warning means the tail-shape parameter is close to 2, where the finite-variance constraint matters. A large-nu note means the Student-t fit may be close to a Gaussian fit. If nu is large, report that the Student-t fit is close to Gaussian and check whether the Gaussian and Student-t mu and sigma conclusions differ. If nu is near the boundary, inspect influential observations, compare Gaussian and Student-t conclusions, and report that the fitted tail-shape parameter is near the lower bound.

Interpret coefficients by parameter

Location and scale coefficients use the same formula grammar as Gaussian location-scale models:

coef(fit_student, "mu")
#> (Intercept)  droughtdry 
#>   1.2084534  -0.4493228
coef(fit_student, "sigma")
#> (Intercept)  droughtdry 
#>  -0.8877247   0.3309563
coef(fit_student, "nu")
#> (Intercept) 
#>    2.030158

The mu coefficient for droughtdry estimates the drought difference in expected growth. The sigma coefficient for droughtdry is on the log Student-t scale-parameter scale: a positive value means the fitted scale is larger under drought. The nu coefficient is on the link scale for nu = 2 + exp(eta_nu). Use prediction to read nu on the response scale:

head(predict(fit_student, dpar = "nu"))
#> [1] 9.615291 9.615291 9.615291 9.615291 9.615291 9.615291

Compare with the Gaussian model

For this simulated example, the two models share mu and sigma formulas: does drought reduce growth, and does it change scale? They differ in the residual-tail assumption.

AIC(fit_gaussian, fit_student)
#>              df      AIC
#> fit_gaussian  4 299.2362
#> fit_student   5 299.1516

AIC is only one summary. The more important habit is to inspect whether the scientific conclusions about mu and sigma are sensitive to the residual tail assumption.

coef(fit_gaussian, "mu")
#> (Intercept)  droughtdry 
#>   1.2242787  -0.4899041
coef(fit_student, "mu")
#> (Intercept)  droughtdry 
#>   1.2084534  -0.4493228

coef(fit_gaussian, "sigma")
#> (Intercept)  droughtdry 
#>  -0.7835855   0.3472741
coef(fit_student, "sigma")
#> (Intercept)  droughtdry 
#>  -0.8877247   0.3309563

If the mu and sigma conclusions change strongly, the large residuals are not a side detail. They are part of the distributional story and should be reported.

Current boundary

The implemented Student-t path is intentionally narrow:

drmTMB(
  bf(y ~ x1, sigma ~ x2, nu ~ x3),
  family = student(),
  data = dat
)

Random effects, known sampling covariance, phylogenetic terms, spatial terms, and bivariate Student-t models are later phases. Start with this fixed-effect path when you need a robust continuous comparison to a Gaussian location-scale model.

Skew-normal residual asymmetry is a separate planned family, not a fitted option today:

# Planned, not fitted yet:
drmTMB(
  bf(y ~ x1, sigma ~ x2, nu ~ x3),
  family = skew_normal(),
  data = dat
)

Use this planned syntax only in design notes until skew_normal() has a density implementation, a normal-limit check at nu = 0, recovery for positive and negative skew, and tests showing that nu ~ x is not merely absorbing heteroscedasticity or outliers. For now, fit the Gaussian and Student-t models above and report whether the location and scale conclusions are robust to heavier tails. ID-level skewness such as future skew(id) ~ x is a different latent-effect question, not an alias for residual Student-t nu or planned skew-normal nu.