
Canonical likelihood-based cross-check for the paired phylogenetic decomposition
Source: R/extract-two-U-cross-check.R
compare_dep_vs_two_U.Rd
Refits the user's data with full unstructured T x T phylogenetic and
non-phylogenetic trait covariances (phylo_dep + dep) using the same
engine, family, link, and unit / cluster grouping as the supplied
two-U fit. Compares the joint two-U fit's implied
\(\boldsymbol\Sigma_{\mathrm{phy}} = \boldsymbol\Lambda_{\mathrm{phy}}\boldsymbol\Lambda_{\mathrm{phy}}^\top + \mathbf S_{\mathrm{phy}}\)
(and the analogous \(\boldsymbol\Sigma_{\mathrm{non}}\)) against
the unstructured baseline. Per-component RMSE plus a flag field
(TRUE when any component disagrees beyond threshold) identifies
two-U identifiability failures.
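The implied total covariance described above can be illustrated with a minimal sketch in base R. The loadings and unique variances below are hypothetical toy values, not output from a real fit, and the sketch assumes that S_phy is diagonal (one unique variance per trait).

```r
# Hypothetical toy values; not taken from a real gllvmTMB fit.
n_traits <- 4  # T
K_phy <- 1     # latent rank of the phylogenetic tier

# T x K loading matrix (assumed values).
Lambda_phy <- matrix(c(0.8, 0.5, -0.3, 0.6), nrow = n_traits, ncol = K_phy)

# Trait-specific (unique) variances, assumed diagonal.
S_phy <- diag(c(0.20, 0.10, 0.30, 0.15))

# Implied total covariance: low-rank part plus diagonal part.
# tcrossprod(X) computes X %*% t(X).
Sigma_phy <- tcrossprod(Lambda_phy) + S_phy
Sigma_phy
```

The same construction applies to the non-phylogenetic tier, with its own loadings and unique variances.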
Arguments
- fit_two_U
A gllvmTMB_multi joint two-U fit, e.g. produced by gllvmTMB(value ~ 0 + trait + phylo_latent(species, d = K_phy) + phylo_unique(species) + unique(0 + trait | species), ...). The cross-check refits the same data and family with phylo_dep + dep and compares.
- threshold
Numeric (default 0.10): relative-disagreement threshold (Frobenius RMSE / Frobenius magnitude of the unstructured estimate) above which a component is flagged.
- phylo_vcv, phylo_tree
Optional. The phylogenetic correlation matrix or ape::phylo tree, only needed if fit_two_U was produced by an older package version that did not store the phylogeny on the fit. Default is to recover them from the fit (fit_two_U$phylo_vcv / fit_two_U$phylo_tree).
Value
A list with components:
- joint
The two-U fit's implied Sigma_phy and Sigma_non (T x T matrices).
- dep
The unstructured phylo_dep + dep baseline's implied Sigma_phy and Sigma_non (T x T matrices). May contain NULL entries if the alt fit failed.
- agreement
Data frame with rows for Sigma_phy and Sigma_non, columns rmse (Frobenius RMSE between joint and dep), dep_mag (Frobenius magnitude of the dep estimate), rel_disagreement (rmse / dep_mag), and flag (rel_disagreement > threshold).
- flag
Logical: TRUE if any component is flagged.
- threshold
The threshold used.
- alt_fit
The refit gllvmTMB_multi object (or NULL on failure), retained so users can inspect convergence and pull other extractor outputs.
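One row of the agreement data frame could be computed along the following lines. This is a sketch, not the package's code: agreement_row() is a hypothetical helper, and the element-wise definitions of rmse and dep_mag are assumptions. The ratio rel_disagreement does not depend on that choice, because any common normalisation cancels and leaves the ratio of Frobenius norms.

```r
# Hypothetical helper; assumed definitions (not confirmed against the package):
#   rmse    = sqrt(mean((joint - dep)^2))
#   dep_mag = sqrt(mean(dep^2))
agreement_row <- function(joint, dep, threshold = 0.10) {
  rmse <- sqrt(mean((joint - dep)^2))
  dep_mag <- sqrt(mean(dep^2))
  rel <- rmse / dep_mag
  data.frame(
    rmse = rmse,
    dep_mag = dep_mag,
    rel_disagreement = rel,
    flag = rel > threshold
  )
}

# Toy 2 x 2 covariances (made-up numbers) that differ only slightly.
dep_toy <- matrix(c(1.0, 0.4, 0.4, 1.0), nrow = 2)
joint_toy <- dep_toy + matrix(c(0.02, -0.01, -0.01, 0.03), nrow = 2)

row_close <- agreement_row(joint_toy, dep_toy)
row_close
```

Under these assumptions the toy matrices give a relative disagreement of roughly 0.025, well below the default threshold, so the row is not flagged.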
Details
Conceptual basis: Williams et al. (2025) bioRxiv 2025.12.20.695312 Eq. 3, generalised across traits. The unstructured fit has \(T(T+1)/2\) free parameters per tier and is the strongest available benchmark for the implied total covariance. When the joint two-U fit is well-identified, its implied total Sigma matches the unstructured baseline; when it isn't, the diagnostic flags the disagreement and the user knows to relax the rank or revisit identifiability.
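To make the parameter counts concrete, the sketch below compares the unstructured count T(T+1)/2 with a reduced-rank-plus-diagonal count per tier. Both function names are hypothetical, and the two-U count assumes the usual factor-analytic convention in which K(K-1)/2 loadings are fixed for rotational identifiability; whether gllvmTMB uses exactly this constraint is an assumption.

```r
# Free covariance parameters per tier for an unstructured T x T matrix.
n_par_unstructured <- function(n_traits) {
  n_traits * (n_traits + 1) / 2
}

# Free parameters per tier for Lambda Lambda^T + S with rank K
# (assumed convention: K(K-1)/2 loadings constrained, S diagonal).
n_par_two_U <- function(n_traits, K) {
  n_traits * K - K * (K - 1) / 2 + n_traits
}

n_par_unstructured(10)  # 55
n_par_two_U(10, K = 2)  # 29
n_par_unstructured(30)  # 465
```

The quadratic growth of the unstructured count in T is what limits the benchmark to moderate numbers of traits.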
The cross-check corresponds to the maintainer's three-levels-of-success framing (dev/two-U-rewrite-plan.md):
- Level 1: can we fit?
Both fits converged.
- Level 2: total Sigma_phy and Sigma_non agree?
This diagnostic answers Level 2 directly. The two estimators target the SAME total covariance with different parameterisations, so agreement is the identifiability check.
- Level 3: split into Lambda Lambda^T + S?
This diagnostic does not test Level 3 directly; Level 2 agreement is the prerequisite for Level 3 to be meaningful.
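The first two levels can be read off the returned list roughly as follows. summarise_levels() is a hypothetical helper, not part of the package; it uses only the documented alt_fit and flag components. A non-NULL alt_fit is a necessary condition for Level 1 only, so convergence of both fits still has to be inspected separately.

```r
# `res` is assumed to be the list returned by compare_dep_vs_two_U().
summarise_levels <- function(res) {
  # Level 1 (necessary condition only): the unstructured refit returned an object.
  level1 <- !is.null(res$alt_fit)
  # Level 2: no component exceeded the relative-disagreement threshold.
  level2 <- level1 && !isTRUE(res$flag)
  c(level1_refit_available = level1, level2_total_sigma_agrees = level2)
}

# Mock results standing in for real output (illustration only).
res_ok <- list(alt_fit = list(), flag = FALSE)
res_failed <- list(alt_fit = NULL, flag = NA)

summarise_levels(res_ok)
summarise_levels(res_failed)
```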
Computational scope: tractable for T <= ~30. For larger T, prefer
compare_indep_vs_two_U().
References
Williams, M. J., McGillycuddy, M., Drobniak, S. M., Bolker, B. M., Warton, D. I., & Nakagawa, S. (2025). Fast phylogenetic generalised linear mixed-effects modelling using the glmmTMB R package. bioRxiv 2025.12.20.695312. doi:10.1101/2025.12.20.695312
Hadfield, J. D. & Nakagawa, S. (2010). General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters. Journal of Evolutionary Biology 23, 494-508. doi:10.1111/j.1420-9101.2009.01915.x
Meyer, K. & Kirkpatrick, M. (2008). Perils of parsimony: properties of reduced-rank estimates of genetic covariance matrices. Genetics 180, 1153-1166. doi:10.1534/genetics.108.090159
Felsenstein, J. (2005). Using the quantitative genetic threshold model for inferences between and within populations. Genetics 169, 925-942. doi:10.1534/genetics.104.025262
Felsenstein, J. (2012). A comparative method for both discrete and continuous characters using the threshold model. American Naturalist 179, 145-156. doi:10.1086/663681
See also
compare_indep_vs_two_U() for the cheap diagonal fallback
when T is large; extract_Sigma() for the underlying covariance
extractor.
Examples
if (FALSE) { # \dontrun{
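  library(ape)

  # Simulate a 200-tip coalescent tree and derive its correlation matrix.
  tree <- ape::rcoal(200)
  tree$tip.label <- paste0("sp", seq_len(200))
  Cphy <- ape::vcv(tree, corr = TRUE)

  # `df` is not constructed here; it is assumed to be a long-format data
  # frame with columns value, trait and species, where the species labels
  # match tree$tip.label.
  fit <- gllvmTMB(
    value ~ 0 + trait + phylo_latent(species, d = 1) +
      phylo_unique(species) + unique(0 + trait | species),
    data = df, phylo_vcv = Cphy, cluster = "species"
  )

  # Named cross_check rather than diag to avoid masking base::diag().
  cross_check <- compare_dep_vs_two_U(fit)
  cross_check$flag
  cross_check$agreement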
} # }