Mean effects and residual heterogeneity in meta-analysis • drmTMB

Meta-analysis in drmTMB is ordinary Gaussian distributional regression with known sampling covariance. The sampling covariance is supplied with meta_V(V = V) inside the location formula; it is not a separate meta_gaussian() family. meta_known_V(V = V) remains a compatibility alias for older examples.

The distinction matters. A meta-analysis usually has two sources of variation: known sampling error from the original studies, and unknown heterogeneity that the meta-analytic model estimates. In drmTMB, known sampling covariance is V, and unknown residual heterogeneity is sigma.

Throughout this article, Normal(a, b) and MVN(a, B) use variance or covariance as the second argument.

The source logic mirrors recent location-scale meta-analysis papers and notes: Nakagawa et al. (2025) frame heterogeneity as a biological and methodological response, Yang and Nakagawa’s distributional-regression manuscript treats location, scale, and shape as modelled distributional parameters, Rodriguez et al. (2023) argue that categorical moderators should usually allow unequal between-study heterogeneity, and Nakagawa’s 2022 unifying note separates ordinary row weights from proportional sampling-variance components. The fitted syntax below stays inside the currently implemented drmTMB surface.

The model

For a univariate meta-analysis with independent sampling errors, the model is:

$y_i \mid \mu_i, \sigma_i, v_i \sim \operatorname{Normal}(\mu_i, v_i + \sigma_i^2)$

$\mu_i = \mathbf{x}_{\mu i}^{\top}\boldsymbol{\beta}_{\mu}, \qquad \log(\sigma_i) = \mathbf{x}_{\sigma i}^{\top}\boldsymbol{\beta}_{\sigma}.$

The matching R syntax is:

drmTMB(
  drm_formula(
    yi ~ x1 + x2 + meta_V(V = vi),
    sigma ~ x1
  ),
  family = gaussian(),
  data = dat
)

Here vi contains known sampling variances. If a dataset contains standard errors, square them before fitting. The sigma formula models the extra heterogeneity left after the known sampling variance has been included. In meta-analysis papers this extra heterogeneity SD is often written as tau; drmTMB keeps the public parameter name sigma so that meta-analysis uses the same distributional grammar as other Gaussian models.

Read each symbol before interpreting coefficients:

Symbol or syntax	Meaning in a meta-analysis	In the restoration example
$y_i$ , `yi`	observed effect size for effect-size row $i$	standardized biodiversity response to restoration
$v_i$ , `vi`	known sampling variance for $y_i$ , treated as known before fitting	uncertainty from the primary restoration study
$\mathbf{V}$ , `V`	known sampling covariance matrix when effect sizes are correlated	a dense or block-diagonal covariance matrix for correlated effect sizes
$\mu_i$ , `mu`	expected effect size after location moderators	mean restoration effect for a habitat and duration value
$\boldsymbol{\beta}_{\mu}$	location coefficients in the mean model	average forest-grassland and duration contrasts
$\sigma_i$ , `sigma`	fitted extra residual heterogeneity SD after known sampling variance is included	remaining among-effect-size unpredictability
$\boldsymbol{\beta}_{\sigma}$	log-SD coefficients for extra heterogeneity	habitat effects on residual heterogeneity, reported as `exp(coef)` SD ratios
`sd(study)`	study-level random-effect SD, when repeated effect sizes are grouped within studies	among-study heterogeneity distinct from residual `sigma`
`weights = w`	ordinary row log-likelihood multiplier	not a substitute for known sampling variance

For correlated effect sizes, the same idea is written with a full known sampling covariance matrix:

$\mathbf{y} \sim \operatorname{MVN} \left( \boldsymbol{\mu}, \mathbf{V} + \operatorname{diag}(\sigma_1^2,\ldots,\sigma_n^2) \right).$

The right-hand side of meta_V(V = V) may be a variance column, a vector, a diagonal matrix, a dense block-diagonal matrix, or a dense full covariance matrix.

Dense means dense. If V is supplied as a full matrix, drmTMB stores the retained covariance as a dense R matrix. This is useful for small to moderate correlated-effect examples and for likelihood-comparator checks, but it is not the large-meta-analysis storage route. Large block-diagonal or sparse known covariance structures need a future sparse or block-sparse path before they should be treated as scalable.

Worked example: restoration effects

Suppose we are synthesising effect sizes from studies that compare ecological restoration with a control condition. The response yi is a standardised effect size for biodiversity. The known sampling variance vi comes from each primary study. We ask two questions:

Is the mean restoration effect different in forests and grasslands?
Is the unexplained among-effect-size heterogeneity larger in one habitat?

This simulated dataset stands in for the structure of a real ecological meta-analysis:

set.seed(20)
n <- 80
dat <- data.frame(
  study = factor(seq_len(n)),
  habitat = factor(
    rep(c("grassland", "forest"), length.out = n),
    levels = c("grassland", "forest")
  ),
  duration = scale(runif(n, 1, 8))[, 1]
)

dat$vi <- runif(n, 0.015, 0.08)

mu <- 0.12 + 0.22 * (dat$habitat == "forest") + 0.08 * dat$duration
sigma_extra <- exp(-1.65 + 0.55 * (dat$habitat == "forest"))

dat$yi <- rnorm(n, mean = mu, sd = sqrt(dat$vi + sigma_extra^2))
head(dat)
#>   study   habitat   duration         vi         yi
#> 1     1 grassland  1.4380023 0.02842515 0.17351093
#> 2     2    forest  1.0691520 0.07322927 1.25000784
#> 3     3 grassland -0.5877070 0.01899541 0.09594502
#> 4     4    forest  0.2590503 0.07619107 0.40060229
#> 5     5 grassland  1.7269741 0.06120553 0.31720693
#> 6     6    forest  1.7860222 0.04391420 0.23201838

The fitted model mirrors the equations above:

fit_meta <- drmTMB(
  drm_formula(
    yi ~ habitat + duration + meta_V(V = vi),
    sigma ~ habitat
  ),
  family = gaussian(),
  data = dat
)

summary(fit_meta)
#> <summary.drmTMB>
#>                       estimate  std_error
#> mu:(Intercept)       0.1455573 0.03952995
#> mu:habitatforest     0.2622841 0.07954687
#> mu:duration          0.1033286 0.03595271
#> sigma:(Intercept)   -1.9804840 0.34097290
#> sigma:habitatforest  1.0093575 0.37666913
#> Distributional, random-effect, scale, and correlation parameters:
#>                         component  dpar         term  estimate   minimum
#> fitted:sigma distributional-scale sigma fitted range 0.2583293 0.1380024
#>                maximum    scale
#> fitted:sigma 0.3786562 response
#> logLik: -24.77
#> convergence: 0

The mu rows describe the average effect-size model. In this simulation the forest coefficient is positive, so the model estimates a larger mean restoration effect for forests than for grasslands, after accounting for study duration and known sampling variance.

The sigma rows are on the log-SD scale. They describe extra heterogeneity, not sampling error. The forest contrast can be read as a multiplicative change in the residual heterogeneity SD:

sigma_coef <- coef(fit_meta, "sigma")
sigma_coef
#>   (Intercept) habitatforest 
#>     -1.980484      1.009357
exp(sigma_coef["habitatforest"])
#> habitatforest 
#>      2.743838

In this run, forest studies have about 2.74 times the extra heterogeneity SD of grassland studies. That multiplier applies to sigma_i; it does not alter the known sampling variances vi.

Categorical moderators and heterogeneous heterogeneity

Rodriguez et al. (2023) recommend treating unequal between-study heterogeneity as the default when testing categorical moderators. In drmTMB, the current Gaussian meta-analysis route expresses that idea with a sigma formula rather than a new tau ~ syntax. For the two-habitat example above,

$\log(\sigma_i) = \gamma_0 + \gamma_1 \operatorname{forest}_i,$

so the fitted extra heterogeneity SDs are

$\sigma_{\mathrm{grassland}} = \exp(\gamma_0), \qquad \sigma_{\mathrm{forest}} = \exp(\gamma_0 + \gamma_1).$

The scale coefficient $\gamma_1$ is therefore not another mean effect. It is the log ratio of forest to grassland residual heterogeneity SD:

$\frac{\sigma_{\mathrm{forest}}} {\sigma_{\mathrm{grassland}}} = \exp(\gamma_1).$

If the report uses variance-facing meta-analysis language, the corresponding extra heterogeneity variance ratio is:

$\frac{\sigma_{\mathrm{forest}}^2} {\sigma_{\mathrm{grassland}}^2} = \exp(2\gamma_1).$

Biologically, the mean forest coefficient asks whether restoration has a larger average biodiversity effect in forests. The sigma forest coefficient asks a different question: are forest restoration effects less predictable across studies after known sampling error and duration are accounted for? Those two answers can point in the same direction, opposite directions, or only one can be important.

It is often helpful to translate fitted values back to the response scale and to the variance scale used in many meta-analysis summaries:

meta_report <- data.frame(
  habitat = dat$habitat,
  known_sampling_variance = dat$vi,
  fitted_extra_heterogeneity_sd = sigma(fit_meta)
)
meta_report$fitted_extra_heterogeneity_variance <-
  meta_report$fitted_extra_heterogeneity_sd^2
meta_report$total_observation_variance <-
  meta_report$known_sampling_variance +
  meta_report$fitted_extra_heterogeneity_variance

meta_by_habitat <- aggregate(
  meta_report[c(
    "fitted_extra_heterogeneity_sd",
    "fitted_extra_heterogeneity_variance",
    "total_observation_variance"
  )],
  list(habitat = meta_report$habitat),
  mean
)
meta_by_habitat
#>     habitat fitted_extra_heterogeneity_sd fitted_extra_heterogeneity_variance
#> 1 grassland                     0.1380024                          0.01904467
#> 2    forest                     0.3786562                          0.14338053
#>   total_observation_variance
#> 1                 0.06715651
#> 2                 0.19328933

For this model, the fitted residual heterogeneity SD is larger in forest studies. The heterogeneity variance column is the same fitted sigma model reported as sigma^2. The total observation variance column adds the known sampling variance vi, so it is larger than the estimated heterogeneity variance alone. A biological interpretation would be: restoration effects are not only larger on average in forests, they are also less predictable across forest studies after known sampling uncertainty is accounted for.

Choose the report scale deliberately:

Quantity	Computation	Interpretation
known sampling variance	`vi` or the supplied `V`	uncertainty known before fitting
extra heterogeneity SD	`sigma(fit_meta)`	residual between-effect-size SD after known sampling variance is included
extra heterogeneity variance	`sigma(fit_meta)^2`	variance-scale version of the fitted `sigma` model
total observation variance	`vi + sigma(fit_meta)^2` for diagonal `V`	sampling variance plus estimated extra heterogeneity

This is why drmTMB keeps the public parameter name sigma: it is the fitted extra heterogeneity scale in the Gaussian likelihood. If a meta-analysis report needs the usual variance-facing language, square sigma and say that the reported column is sigma^2.

A quick diagnostic check flags optimizer, Hessian, row-filtering, scale, and known sampling covariance problems before interpretation:

check_drm(fit_meta)
#> <drm_check: 11 checks>
#> ok: 11; notes: 0; warnings: 0; errors: 0
#>                      check status
#>      optimizer_convergence     ok
#>           optimizer_budget     ok
#>           finite_objective     ok
#>             fixed_gradient     ok
#>            sdreport_status     ok
#>  hessian_positive_definite     ok
#>     standard_errors_finite     ok
#>               dropped_rows     ok
#>             positive_scale     ok
#>  known_sampling_covariance     ok
#>   fixed_effect_design_size     ok
#>                                                                                   value
#>                                                                                       0
#>                                                 iterations=31; function=43; gradient=32
#>                                                                                   24.77
#>                                                     max=0.0001424; component=beta_mu[3]
#>                                                                                      ok
#>                                                                                    TRUE
#>                                                                  range=[0.03595,0.3767]
#>                                                                      nobs=80; dropped=0
#>                                                                              min=0.1380
#>                                            type=diagonal; n=80; range=[0.01520,0.07979]
#>  total_mb=0.01540; max_cols=3; largest=mu; largest_class=matrix; largest_density=0.8333
#>                                                                              message
#>                                                        nlminb convergence code is 0.
#>  Optimizer evaluation counts recorded; no eval.max or iter.max control was supplied.
#>                                             Objective and log-likelihood are finite.
#>        Maximum absolute fixed gradient is <= 0.001; largest component is beta_mu[3].
#>                                              TMB::sdreport() completed successfully.
#>                                        sdreport reports a positive-definite Hessian.
#>                                         All fixed-effect standard errors are finite.
#>                   No rows were dropped by model-frame or known-covariance filtering.
#>                                     All fitted scale values are finite and positive.
#>                   Known sampling covariance is recorded through meta_known_V(V = V).
#>                          Dense fixed-effect design matrices are modest for this fit.

For full-matrix known covariance fits, the known_sampling_covariance row is a note rather than a problem by itself. It reports the retained matrix dimension, dense storage, density, size, rank, and conditioning so you can see whether the current dense route still matches the scale of the analysis.

Weights are not sampling variances

The top-level weights = argument and meta_V(V = V) answer different questions. Ordinary likelihood weights multiply each observation’s log-likelihood contribution:

$\ell(\theta) = \sum_i w_i \ell_i(\theta).$

Known sampling covariance changes the covariance model:

$\mathbf{y} \sim \operatorname{MVN} \left( \boldsymbol{\mu}, \mathbf{V} + \boldsymbol{\Omega}_{\mathrm{estimated}} \right).$

Thus weights = 1 / vi is not the same model as meta_V(V = vi). Use meta_V(V = vi) when vi is known sampling variance. Use weights = only when the intended analysis is a weighted likelihood, such as case weights or externally defined row weights. The practical consequence is that weights = 1 / vi changes how much each row contributes to the likelihood; it does not put vi into the modelled sampling variance. The current dense full-V path rejects non-unit weights until joint-block weighting has a separate design.

A proportional sampling-variance model, for example a term such as $\pi_i \sim \operatorname{Normal}(0, \phi_\pi / w_i)$ , is a different model again. It is not implemented in drmTMB yet and should not be mimicked with the top-level weights = argument. The preferred additive known-V spelling is meta_V(V = V); the proportional branch remains design-only:

# Preferred additive known-V spelling.
meta_V(V = V)

# Future design only; this call is not implemented.
meta_V(w = w, scale = "proportional")

meta_known_V(V = V) remains a compatibility alias for the additive known-V form, not a second likelihood path. The response is already on the left-hand side of the model formula, so meta_V() should not repeat it as a positional argument. Current fitted examples should use meta_V(V = V) for known sampling covariance and weights = only for likelihood weights.

Known sampling variance plus group-level heterogeneity

Some meta-analyses contain multiple effect sizes per study, species, lab, or site. Then sigma and sd(group) answer different questions. The next code block is schematic; it assumes a dataset such as dat_repeated, with multiple rows per study and a study-level habitat value that is constant within each study after missing rows are removed:

fit_study <- drmTMB(
  drm_formula(
    yi ~ treatment + (1 | study) + meta_V(V = vi),
    sigma ~ 1,
    sd(study) ~ habitat
  ),
  family = gaussian(),
  data = dat_repeated
)

The symbolic model is:

$y_i = \mu_i + b_{j[i]} + r_i + s_i, \qquad \mu_i = \mathbf{x}_{\mu i}^{\top}\boldsymbol{\beta}_{\mu},$

$b_j \sim \operatorname{Normal}(0, \omega_j^2), \qquad \log(\omega_j) = \mathbf{x}_{\omega j}^{\top}\boldsymbol{\alpha},$

$r_i \sim \operatorname{Normal}(0, \sigma_i^2), \qquad \mathbf{s} \sim \operatorname{MVN}(\mathbf{0}, \mathbf{V}).$

Here s_i is the known sampling-error component supplied by meta_V(V = vi) or meta_V(V = V). The sigma formula models remaining observation-level heterogeneity. The sd(study) formula models \omega_j, the study-level random-effect standard deviation. Predictors in sd(study) ~ ... must be constant within each study after missing-data filtering; otherwise a single study-level random-effect SD would not be well defined.

After integrating over the random effects, the marginal covariance is:

$\boldsymbol{\Omega} = \mathbf{V} + \operatorname{diag}(\sigma_1^2,\ldots,\sigma_n^2) + \mathbf{Z}_{study} \operatorname{diag}(\omega_1^2,\ldots,\omega_J^2) \mathbf{Z}_{study}^{\top}.$

This is the distinction to keep in mind: sigma ~ habitat asks whether effect-size-level residual heterogeneity changes with habitat, while sd(study) ~ habitat asks whether the study-level random-effect SD changes with habitat.

Bivariate meta-analysis

Bivariate meta-analysis uses the same Gaussian bivariate route as other location-coscale models, where coscale means the residual correlation component rho12. The known sampling covariance should include the within-study covariance between the two effect-size estimates when it is available.

For study $i$ , write:

$\mathbf{y}_i = \begin{bmatrix} y_{1i} \\ y_{2i} \end{bmatrix}, \qquad \mathbf{S}_i = \begin{bmatrix} v_{1i} & c_{12i} \\ c_{12i} & v_{2i} \end{bmatrix}.$

The fitted residual covariance is:

$\boldsymbol{\Omega}_i = \begin{bmatrix} \sigma_{1i}^2 & \rho_{12i}\sigma_{1i}\sigma_{2i} \\ \rho_{12i}\sigma_{1i}\sigma_{2i} & \sigma_{2i}^2 \end{bmatrix}, \qquad \rho_{12i} = \tanh(\eta_{\rho12 i}).$

The full observation model is:

$\mathbf{y}_i \sim \operatorname{MVN} \left( \boldsymbol{\mu}_i, \mathbf{S}_i + \boldsymbol{\Omega}_i \right).$

The helper meta_vcov_bivariate() builds the common row-paired matrix:

V <- meta_vcov_bivariate(
  v1 = c(0.04, 0.03),
  v2 = c(0.05, 0.02),
  cor12 = c(0.4, 0.2)
)
V

The stacking is [y1_1, y2_1, y1_2, y2_2, ...]. In a fitted model, this known sampling covariance is added to the estimated residual covariance:

fit_biv <- drmTMB(
  drm_formula(
    mu1 = y1 ~ habitat + meta_V(V = V),
    mu2 = y2 ~ habitat,
    sigma1 = ~ habitat,
    sigma2 = ~ habitat,
    rho12 = ~ habitat
  ),
  family = c(gaussian(), gaussian()),
  data = dat
)

Here rho12 is the estimated residual correlation after the known within-study sampling covariance has already been included. In a model where the estimated residual component represents between-study heterogeneity, this is the between-study residual correlation. In other words, rho12 should not be asked to explain sampling correlation that belongs in V. The current implementation expects complete bivariate rows and a dense 2n by 2n row-paired V matrix.

For reporting, translate the fitted residual scales and residual correlation into marginal residual variances and residual covariance:

s <- sigma(fit_biv)
residual_variance_1 <- s$sigma1^2
residual_variance_2 <- s$sigma2^2
residual_covariance_12 <- rho12(fit_biv) * s$sigma1 * s$sigma2

These are residual heterogeneity summaries after the known sampling covariance has been accounted for. They are not replacements for the supplied V.

Implementation caveats

Dense full covariance is useful for correlated effect sizes and can represent block-diagonal covariance. Sparse covariance storage is planned for larger phylogenetic and spatial workloads.

Bivariate known-V fitting currently uses a dense row-paired matrix and complete bivariate rows. Missing single outcomes and sparse matrix storage need separate likelihood and user-interface decisions.

Negative and missing known variances are rejected or removed through the same row-retention rules as the model data; full matrices must be symmetric and positive semidefinite after row subsetting.

The API should keep family = gaussian() and sigma ~ ...; it should not introduce meta_gaussian() or tau ~ syntax without a separate design decision.

For large sparse covariance matrices, incomplete bivariate rows, or dense full-V models that also need non-unit likelihood weights, the current best practice is to simplify the fitted example, run a sensitivity analysis with the supported covariance structure, and treat the unsupported path as a design request rather than forcing it through weights =.