Canonical likelihood-based cross-check for the paired phylogenetic decomposition

Refits the user's data with full unstructured T x T phylogenetic and non-phylogenetic trait covariances (phylo_dep + dep), using the same engine, family, link, and unit / cluster grouping as the supplied two-U fit. It then compares the joint two-U fit's implied $\boldsymbol\Sigma_{\mathrm{phy}} = \boldsymbol\Lambda_{\mathrm{phy}}\boldsymbol\Lambda_{\mathrm{phy}}^\top + \mathbf S_{\mathrm{phy}}$ (and the analogous $\boldsymbol\Sigma_{\mathrm{non}}$) against this unstructured baseline. The per-component RMSE and a flag field (TRUE when any component's relative disagreement exceeds threshold) together indicate possible two-U identifiability failures.

Usage

compare_dep_vs_two_U(
  fit_two_U,
  threshold = 0.1,
  phylo_vcv = NULL,
  phylo_tree = NULL
)

Arguments

fit_two_U: A gllvmTMB_multi joint two-U fit, e.g. produced by gllvmTMB(value ~ 0 + trait + phylo_latent(species, d = K_phy) + phylo_unique(species) + unique(0 + trait | species), ...). The cross-check refits the same data and family with phylo_dep + dep and compares.
threshold: Numeric (default 0.10): relative-disagreement threshold (Frobenius RMSE / Frobenius magnitude of the unstructured estimate) above which a component is flagged.
phylo_vcv, phylo_tree: Optional. The phylogenetic correlation matrix or ape::phylo tree, only needed if fit_two_U was produced by an older package version that did not store the phylogeny on the fit. Default is to recover them from the fit (fit_two_U$phylo_vcv / fit_two_U$phylo_tree).

Value

A list with components:

joint: The two-U fit's implied Sigma_phy and Sigma_non (T x T matrices).
dep: The unstructured phylo_dep + dep baseline's implied Sigma_phy and Sigma_non (T x T matrices). May contain NULL entries if the refit failed.
agreement: Data frame with rows for Sigma_phy and Sigma_non, columns rmse (Frobenius RMSE between joint and dep), dep_mag (Frobenius magnitude of the dep estimate), rel_disagreement (rmse / dep_mag), and flag (rel_disagreement > threshold).
flag: Logical: TRUE if any component is flagged.
threshold: The threshold used.
alt_fit: The refit gllvmTMB_multi object (or NULL on failure), retained so users can inspect convergence and pull other extractor outputs.
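
The columns of agreement can be illustrated with a minimal base-R sketch. The helper name rel_disagreement_sketch() is hypothetical and not part of the package, and the normalisation is an assumption: the RMSE and the magnitude are both taken here as root mean squares over all matrix entries, so their ratio equals the ratio of Frobenius norms. The package's internal computation may differ in detail.

```r
# Hypothetical helper (not a package function): compares two T x T
# covariance matrices in the way the `agreement` columns are described.
rel_disagreement_sketch <- function(sigma_joint, sigma_dep, threshold = 0.1) {
  stopifnot(identical(dim(sigma_joint), dim(sigma_dep)))
  # Assumption: "Frobenius RMSE" = root mean squared entrywise difference.
  rmse <- sqrt(mean((sigma_joint - sigma_dep)^2))
  # Assumption: "Frobenius magnitude" = root mean square of the dep entries.
  dep_mag <- sqrt(mean(sigma_dep^2))
  rel <- rmse / dep_mag
  data.frame(
    rmse = rmse,
    dep_mag = dep_mag,
    rel_disagreement = rel,
    flag = rel > threshold
  )
}

# Toy 2 x 2 matrices, made up for illustration only.
sigma_dep   <- matrix(c(1.0, 0.5, 0.5, 2.0), nrow = 2)
sigma_joint <- sigma_dep + 0.05  # every entry off by 0.05
out <- rel_disagreement_sketch(sigma_joint, sigma_dep)
print(out)
```

With these toy inputs the relative disagreement is below the default threshold of 0.1, so the component would not be flagged.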

Details

Conceptual basis: Williams et al. (2025) bioRxiv 2025.12.20.695312 Eq. 3, generalised across traits. The unstructured fit has $T(T+1)/2$ free parameters per tier and is the strongest available benchmark for the implied total covariance. When the joint two-U fit is well-identified, its implied total Sigma matches the unstructured baseline; when it isn't, the diagnostic flags the disagreement and the user knows to relax the rank or revisit identifiability.
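
As a rough illustration of the implied covariance that enters the comparison, the sketch below builds Sigma = Lambda Lambda^T + S for one tier from made-up values. It assumes S is diagonal (one unique variance per trait) and a single latent axis; the object names are illustrative and are not extractor outputs of the package.

```r
# Toy values for one tier (say, the phylogenetic tier); not estimates.
n_traits <- 4
n_latent <- 1

# Loadings matrix Lambda: n_traits x n_latent.
Lambda <- matrix(c(0.8, 0.5, -0.3, 0.6), nrow = n_traits, ncol = n_latent)

# Assumption: S is diagonal, holding trait-specific unique variances.
s_unique <- c(0.20, 0.10, 0.30, 0.25)

# Implied total covariance: Lambda %*% t(Lambda) + S.
Sigma <- tcrossprod(Lambda) + diag(s_unique)
print(round(Sigma, 3))

# The unstructured baseline estimates every distinct entry of Sigma freely.
n_free_unstructured <- n_traits * (n_traits + 1) / 2
n_free_unstructured
```

The unstructured fit targets the same T x T matrix without imposing the low-rank-plus-diagonal form, which is why agreement between the two is informative about the structured fit.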

Cross-check intent corresponds to the maintainer's three-levels-of-success framing (dev/two-U-rewrite-plan.md):

Level 1: can we fit?: Both fits converged.
Level 2: total Sigma_phy and Sigma_non agree?: This diagnostic answers Level 2 directly. The two estimators target the SAME total covariance with different parameterisations, so agreement is the identifiability check.
Level 3: split into Lambda Lambda^T + S?: This diagnostic does not test Level 3 directly; Level 2 agreement is the prerequisite for Level 3 to be meaningful.
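
One possible way to read the returned list against these levels is sketched below. The helper summarise_cross_check() is hypothetical, the mock result is hand-built rather than produced by a real fit, and the use of row names on agreement to label components is an assumption. A non-NULL alt_fit only shows that the refit returned an object; convergence of either fit still needs to be inspected separately.

```r
# Hypothetical helper (not a package function). It relies only on the
# documented fields `alt_fit`, `agreement`, and `flag`.
summarise_cross_check <- function(res) {
  if (is.null(res$alt_fit)) {
    return("Level 1 not reached: the phylo_dep + dep refit failed.")
  }
  if (isTRUE(res$flag)) {
    # Assumption: components are identified by the row names of `agreement`.
    flagged <- rownames(res$agreement)[res$agreement$flag]
    return(paste("Level 2 disagreement in:", paste(flagged, collapse = ", ")))
  }
  "Level 2 agreement: no component exceeds the threshold."
}

# Hand-built stand-in for the returned list; all values are made up.
mock <- list(
  alt_fit = list(),  # placeholder for the refit object
  agreement = data.frame(
    rmse = c(0.02, 0.30),
    dep_mag = c(1.0, 1.2),
    rel_disagreement = c(0.02, 0.25),
    flag = c(FALSE, TRUE),
    row.names = c("Sigma_phy", "Sigma_non")
  ),
  flag = TRUE,
  threshold = 0.1
)
msg <- summarise_cross_check(mock)
print(msg)
```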

Computational scope: tractable for T <= ~30. For larger T, prefer compare_indep_vs_two_U().
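
To give a feel for why the unstructured refit becomes costly as T grows, the sketch below tabulates the $T(T+1)/2$ free covariance parameters per tier for a few trait counts. The function name is illustrative, and parameter count is only one driver of fitting cost.

```r
# Free parameters in one unstructured T x T covariance matrix.
n_free_unstructured <- function(n_traits) n_traits * (n_traits + 1) / 2

n_traits <- c(5, 10, 30, 50)
scope <- data.frame(
  n_traits = n_traits,
  per_tier = n_free_unstructured(n_traits),
  both_tiers = 2 * n_free_unstructured(n_traits)  # phylo_dep + dep
)
print(scope)
```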

References

Williams, M. J., McGillycuddy, M., Drobniak, S. M., Bolker, B. M., Warton, D. I., & Nakagawa, S. (2025). Fast phylogenetic generalised linear mixed-effects modelling using the glmmTMB R package. bioRxiv 2025.12.20.695312. doi:10.1101/2025.12.20.695312

Hadfield, J. D. & Nakagawa, S. (2010). General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters. Journal of Evolutionary Biology 23, 494-508. doi:10.1111/j.1420-9101.2009.01915.x

Meyer, K. & Kirkpatrick, M. (2008). Perils of parsimony: properties of reduced-rank estimates of genetic covariance matrices. Genetics 180, 1153-1166. doi:10.1534/genetics.108.090159

Felsenstein, J. (2005). Using the quantitative genetic threshold model for inferences between and within populations. Genetics 169, 925-942. doi:10.1534/genetics.104.025262

Felsenstein, J. (2012). A comparative method for both discrete and continuous characters using the threshold model. American Naturalist 179, 145-156. doi:10.1086/663681

Examples

if (FALSE) { # \dontrun{
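library(ape)

# `df` is assumed to be a long-format data frame with columns `value`,
# `trait`, and `species` (matching the tip labels); it is not built here.
tree <- ape::rcoal(200)
tree$tip.label <- paste0("sp", seq_len(200))
Cphy <- ape::vcv(tree, corr = TRUE)  # phylogenetic correlation matrix
fit  <- gllvmTMB(
  value ~ 0 + trait + phylo_latent(species, d = 1) +
          phylo_unique(species) + unique(0 + trait | species),
  data = df, phylo_vcv = Cphy, cluster = "species"
)

# Named `check` rather than `diag` to avoid masking base::diag().
check <- compare_dep_vs_two_U(fit)
check$flag       # TRUE if any component is flagged
check$agreement  # per-component rmse, dep_mag, rel_disagreement, flag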
} # }