Parametric bootstrap for Sigma, correlations, communalities, and ICCs

Generates n_boot parametric bootstrap replicates of a fitted gllvmTMB_multi model and returns percentile confidence intervals for the canonical biological summaries: trait covariance matrices $\hat\Sigma_B$, $\hat\Sigma_W$; the corresponding correlation matrices $\hat R_B$, $\hat R_W$; per-trait communalities $c_t^2 = (\Lambda \Lambda^\top)_{tt} / \Sigma_{tt}$; and per-trait site-level ICCs $R_t = (\Sigma_B)_{tt} / [(\Sigma_B)_{tt} + (\Sigma_W)_{tt}]$.

Usage

bootstrap_Sigma(
  fit,
  n_boot = 200,
  level = c("unit", "unit_obs", "phy", "B", "W"),
  what = c("Sigma", "R", "communality", "ICC"),
  conf = 0.95,
  seed = NULL,
  n_cores = 1,
  progress = TRUE,
  keep_draws = FALSE
)

Arguments

fit: A gllvmTMB_multi object.
n_boot: Integer; number of bootstrap replicates. Default 200.
level: Character vector; which tier(s) to bootstrap. Subset of c("B", "W", "phy"). Tiers absent from the fit are silently dropped. Default: all three.
what: Character vector; which summaries to compute. Subset of c("Sigma", "R", "communality", "ICC"). Default: all four. "ICC" only makes sense at the site level and requires both B and W tiers in the fit.
conf: Numeric in (0, 1); confidence level for percentile CIs. Default 0.95.
seed: Optional RNG seed for reproducibility.
n_cores: Integer; number of cores for parallel refits. Default 1 (sequential). >= 2 uses future::multisession.
progress: Logical; print a one-line status message at each replicate (sequential only). Default TRUE.
keep_draws: Logical; if TRUE, the full n_boot x ... matrices of bootstrap draws are returned as $draws. Default FALSE (CIs only — saves memory for large n_boot).

Value

A list with components:

point_est: Named list of point estimates for each requested summary at each requested level (e.g. Sigma_B, R_B, communality_B, ICC_site).
ci_lower, ci_upper: Named lists of percentile CI bounds, element-wise the same shape as the corresponding point_est.
ci_method: Character; currently "percentile".
conf, n_boot, n_failed: Configuration metadata.
draws: NULL unless keep_draws = TRUE; otherwise a named list of bootstrap draw arrays.

Details

Each bootstrap replicate (1) draws a new response vector from simulate(fit, nsim = 1), (2) refits the model with the same formula on the simulated data, (3) extracts the requested summaries via extract_Sigma(), extract_communality(), and extract_ICC_site(). Replicates whose refit fails to converge are recorded but excluded from CI calculation.

Multicore is dispatched via future + future.apply; pass n_cores >= 2 to enable parallel refits. When parallel, replicates use future.apply's L'Ecuyer-CMRG seed stream so the answers are reproducible given a fixed seed, but they are NOT bit-identical to an n_cores = 1 run with the same seed (different RNG streams).

Caveats

Uses the existing simulate.gllvmTMB_multi() method, which conditions on the fitted random effects ($\eta = \hat\eta$) and adds Gaussian residual noise. CIs reflect residual-level uncertainty in the random-effect modes, not the full posterior uncertainty in the variance components. For non-Gaussian families the simulator is not yet implemented; this function will error.
Refits use the same formula reconstructed from fit$formula and fit$covstructs. Auxiliary arguments such as phylo_vcv, mesh, lambda_constraint are NOT currently forwarded; pass formula and extra_args explicitly via the low-level path if you need them.
Convergence: replicates whose refit fails or whose optimiser does not return convergence == 0 are counted in n_failed and excluded from CIs.

Examples

if (FALSE) { # \dontrun{
set.seed(1)
s <- simulate_site_trait(n_sites = 30, n_species = 1, n_traits = 3,
                         mean_species_per_site = 1,
                         Lambda_B = matrix(c(1, .5, -.4), 3, 1),
                         S_B = c(.2, .15, .1))
fit <- gllvmTMB(value ~ 0 + trait + latent(0 + trait | site, d = 1) +
                           unique(0 + trait | site),
                data = s$data)
boot <- bootstrap_Sigma(fit, n_boot = 50, level = "B",
                        what = c("Sigma", "R"), seed = 42)
boot$point_est$Sigma_B
boot$ci_lower$Sigma_B
boot$ci_upper$Sigma_B
} # }