Skip to contents

Implements the decomposition $$\boldsymbol\Sigma_\text{tier} \;=\; \underbrace{\boldsymbol\Lambda_\text{tier}\boldsymbol\Lambda_\text{tier}^\top}_{\text{shared (latent)}} \;+\; \underbrace{\mathbf S_\text{tier}}_{\text{unique (unique)}},$$ where \(\boldsymbol\Lambda\) comes from the latent() term at that tier and \(\mathbf S\) comes from the corresponding unique() term. This is the same decomposition the behavioural-syndromes / phenotypic-integration literature uses (Bartholomew et al. 2011; Nakagawa et al. in prep), equations 15, 22, and 30 of the methods paper.

Usage

extract_Sigma(
  fit,
  level = c("unit", "unit_obs", "phy", "spatial", "cluster", "B", "W", "spde"),
  part = c("total", "shared", "unique"),
  link_residual = c("auto", "none"),
  .skip_warn = FALSE
)

Arguments

fit

A gllvmTMB_multi fit.

level

One of "unit" (between-unit), "unit_obs" (within-unit), "phy" (phylogenetic), "spatial", or "cluster". Legacy aliases "B", "W", and "spde" are accepted with a soft-deprecation message.

part

One of "total" (default), "shared", "unique".

For non-Gaussian fits. "auto" (default) adds a per-trait link-specific implicit residual variance to the diagonal of Sigma, giving the marginal latent-scale interpretation; in mixed- family fits each trait gets the residual implied by its family/link (see "Family-aware link residuals" below for the full table). "none" returns the latent+unique-implied \(\boldsymbol\Lambda \boldsymbol\Lambda^\top + \mathbf S\) with no implicit residual added. For Gaussian or lognormal-only fits this argument is effectively a no-op (their implied \(\sigma^2_d = 0\)).

.skip_warn

Internal flag (default FALSE). When TRUE, suppresses the once-per-session deprecation message that the internal .normalise_level() helper emits for legacy level aliases ("B", "W", "spde", "total"). Used by internal callers (e.g. extract_Omega()) that have already issued the deprecation message themselves; not part of the public API.

Value

For part = "total" or "shared": a list with components Sigma (T x T matrix), R (T x T correlation matrix; only for "total"), level, part, and note (character vector of advisory messages, e.g. about a missing unique() term).

For part = "unique": a list with s (length-T named numeric vector of unique variances), level, part, note.

Details

Why both latent() and unique() matter

If the formula has only latent(0 + trait | unit, d = K) and no unique(0 + trait | unit), the engine can only fit the \(\boldsymbol\Lambda \boldsymbol\Lambda^\top\) component – there is no slot for trait-specific unique variance \(\mathbf S\). Calling extract_Sigma(fit, level, part = "total") then returns just \(\boldsymbol\Lambda \boldsymbol\Lambda^\top\), which understates the diagonal of the true covariance. Any correlations computed from this incomplete \(\hat{\boldsymbol\Sigma}\) are systematically inflated (the same numerator with a too-small denominator).

For Gaussian / lognormal / Gamma fits this function emits a one-shot message reminding the user to add + unique(0 + trait | unit) (or its within-unit analogue) when this happens. Add the unique() term and the decomposition is complete.

For non-Gaussian families (binomial, Poisson, Gamma) the latent-scale residual variance has a closed-form approximation that should be added to the diagonal of \(\boldsymbol\Sigma\) – see the link_residual argument below.

The part argument

"total" (default)

\(\boldsymbol\Sigma_\text{tier} = \boldsymbol\Lambda \boldsymbol\Lambda^\top + \mathbf S\) – the matrix users almost always want for reporting correlations.

"shared"

\(\boldsymbol\Lambda \boldsymbol\Lambda^\top\) only – the reduced-rank, rotation-invariant component. Diagonals are \(\sum_\ell \Lambda_{t\ell}^2\); these are not the trait variances, they are the shared part of the trait variances.

"unique"

\(\mathbf S_\text{tier}\) only – the trait-specific unique variances, returned as a length-T named numeric vector (the diagonal of \(\mathbf S\)).

Caveat: "shared" vs "unique" partition is only weakly identified

The total \(\boldsymbol\Sigma_\text{tier} = \boldsymbol\Lambda \boldsymbol\Lambda^\top + \mathbf S\) is rotation-invariant and well-identified, so part = "total" is well-identified. But the split between \(\boldsymbol\Lambda \boldsymbol\Lambda^\top\) and \(\mathbf S\) is only weakly identified – different optimiser starts can flow trait \(t\)'s variance more into the shared ("shared") or unique ("unique") component, with the same total likelihood. In a synthetic Poisson + OLRE recovery run with target \(\sigma^2_S = (0.5, 0.4, 0.3, 0.6)\), gllvmTMB returned \((0, 0.67, 0, 0.57)\) – total trait variances correct, but the partition arbitrary.

This is the standard rotation-and-shift indeterminacy of factor models (gllvm and Hmsc have it too); communality \(c_t^2\) computed from a single fit therefore inherits the same indeterminacy. For interpretation, lean on part = "total" for correlations and use extract_communality() only when paired with bootstrap_Sigma() or the variance-decomposition consistency checks in gllvmTMB_diagnose().

For non-Gaussian responses each row carries an implicit observation-level residual on the latent (link) scale. Adding it to the per-trait diagonal of \(\boldsymbol\Sigma\) is what makes cross-family Sigma comparable on the same (latent) scale: without this adjustment, binomial diagonals are too small relative to a Gaussian latent of comparable variance and the implied cross-family correlations are inflated.

Per-family formulas (Nakagawa & Schielzeth 2010; Nakagawa, Johnson & Schielzeth 2017):

gaussian (identity)\(\sigma^2_d = 0\)
binomial(link = "logit")\(\sigma^2_d = \pi^2/3 \approx 3.290\)
binomial(link = "probit")\(\sigma^2_d = 1\)
binomial(link = "cloglog")\(\sigma^2_d = \pi^2/6 \approx 1.645\)
poisson(link = "log")\(\sigma^2_d = \log(1 + 1/\hat\mu_t)\) (lognormal-Poisson approx.)
lognormal(link = "log")\(\sigma^2_d = 0\) (sigma_eps already models the log-scale residual)
Gamma(link = "log")\(\sigma^2_d = \psi'(\hat\nu)\) where \(\hat\nu = 1/\hat\sigma_\varepsilon^2\) is the shape
nbinom2(link = "log")\(\sigma^2_d = \psi'(\hat\phi)\) where \(\hat\phi\) is the per-trait NB2 dispersion
tweedie(link = "log")\(\sigma^2_d = \log(1 + \hat\phi \hat\mu_t^{\hat p - 2})\) (delta method)
Beta(link = "logit")\(\sigma^2_d = \psi'(\hat\mu_t \hat\phi) + \psi'((1 - \hat\mu_t)\hat\phi)\) (Smithson & Verkuilen 2006)
betabinomial(link = "logit")\(\sigma^2_d = \pi^2/3 + \psi'(\hat\mu_t \hat\phi) + \psi'((1 - \hat\mu_t)\hat\phi)\)

For mixed-family fits the residual is computed per trait from the family of the rows belonging to that trait, then added to the diagonal of \(\boldsymbol\Sigma\) entry-by-entry. The default link_residual = "auto" applies this; "none" returns the latent+unique-implied \(\boldsymbol\Lambda \boldsymbol\Lambda^\top + \mathbf S\) with no implicit residual added. For continuous-only Gaussian or lognormal fits "auto" is a no-op (their per-trait \(\sigma^2_d\) is zero).

Citations: Nakagawa & Schielzeth (2010); Nakagawa, Johnson & Schielzeth (2017) — see References below.

Future: 3+ latent tiers

The engine currently supports two latent() tiers ("unit" and "unit_obs"); legacy level = "B" and "W" aliases are still accepted with a soft-deprecation message. level = "phy" extracts the phylogenetic implied \(\boldsymbol\Sigma_\text{phy}\) from phylo_latent(). If a future release adds 3+ latent tiers, level = "<colname>" will dispatch to the corresponding tier without API change. For now, custom strings error with a clear roadmap message.

References

Nakagawa, S. & Schielzeth, H. (2010). Repeatability for Gaussian and non-Gaussian data: a practical guide for biologists. Biological Reviews 85, 935-956. doi:10.1111/j.1469-185X.2010.00141.x

Nakagawa, S., Johnson, P. C. D., & Schielzeth, H. (2017). The coefficient of determination \(R^2\) and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. Journal of the Royal Society Interface 14(134), 20170213. doi:10.1098/rsif.2017.0213

See also

extract_communality() for the per-trait shared / unique variance share at one tier; extract_proportions() for the canonical per-trait variance decomposition across tiers; extract_Omega() for the multi-tier sum.

Examples

if (FALSE) { # \dontrun{
fit <- gllvmTMB(
  value ~ 0 + trait +
          latent(0 + trait | unit, d = 2) + unique(0 + trait | unit),
  data = df
)
extract_Sigma(fit, level = "unit", part = "total")$Sigma   # full T x T cov
extract_Sigma(fit, level = "unit", part = "shared")$Sigma  # rr-only
extract_Sigma(fit, level = "unit", part = "unique")$s      # diag(s_unit)
} # }