Phylogenetic trait covariance • gllvmTMB

This article is for a comparative-methods reader with several traits measured once per species. The working question is: can we separate phylogenetically conserved trait covariance from non-phylogenetic species-level covariance, while still keeping trait-specific variation on the diagonal?

The model has two species-level trait-covariance matrices. The phylogenetic component uses the tree correlation matrix A; the non-phylogenetic component uses independent species effects:

$g_{\mathrm{phy}} \sim \mathrm{MVN}(0, \Sigma_{\mathrm{phy}} \otimes A), \qquad g_{\mathrm{non}} \sim \mathrm{MVN}(0, \Sigma_{\mathrm{non}} \otimes I).$

Each covariance can use the same latent() + unique() decomposition:

$\Sigma_{\mathrm{phy}} = \Lambda_{\mathrm{phy}}\Lambda_{\mathrm{phy}}^T + S_{\mathrm{phy}}, \qquad \Sigma_{\mathrm{non}} = \Lambda_{\mathrm{non}}\Lambda_{\mathrm{non}}^T + S_{\mathrm{non}}.$

Here Lambda is the shared low-rank structure and S is the trait-specific diagonal variance. The old project nickname “two-U” means this paired phylogenetic and non-phylogenetic decomposition; the math in current gllvmTMB uses S / s for the unique diagonal.

Setup

library(gllvmTMB)
library(ape)

Simulate

We simulate three traits on 30 species. The truth has four components, matching the paired decomposition in the theory section: a phylogenetic shared axis (Lambda_phy), a phylogenetic diagonal component (s_phy), a non-phylogenetic shared axis (Lambda_non), and a non-phylogenetic diagonal component (s_non). This small example is meant to show syntax and extraction, not to be a full simulation study.

set.seed(13)
n_sp <- 30
n_traits <- 3
trait_names <- paste0("trait_", seq_len(n_traits))

tree <- ape::rcoal(n_sp)
tree$tip.label <- paste0("sp", seq_len(n_sp))
Cphy <- ape::vcv(tree, corr = TRUE)
Lphy <- t(chol(Cphy + 1e-8 * diag(n_sp)))

Lambda_phy_true <- matrix(c(0.70, 0.45, 0.25), n_traits, 1)
Lambda_non_true <- matrix(c(0.50, 0.30, 0.20), n_traits, 1)
s_phy_true <- c(0.20, 0.15, 0.12)
s_non_true <- c(0.18, 0.20, 0.16)

g_phy <- Lphy %*% rnorm(n_sp)
g_non <- matrix(rnorm(n_sp), n_sp, 1)
e_phy <- Lphy %*% matrix(rnorm(n_sp * n_traits), n_sp, n_traits)
e_non <- matrix(rnorm(n_sp * n_traits), n_sp, n_traits)

Y <- g_phy %*% t(Lambda_phy_true) +
  sweep(e_phy, 2, sqrt(s_phy_true), "*") +
  g_non %*% t(Lambda_non_true) +
  sweep(e_non, 2, sqrt(s_non_true), "*")
rownames(Y) <- tree$tip.label
colnames(Y) <- trait_names

df <- data.frame(
  species = factor(rep(tree$tip.label, each = n_traits),
                   levels = tree$tip.label),
  trait = factor(rep(trait_names, times = n_sp),
                 levels = trait_names),
  value = as.numeric(t(Y))
)

df_wide <- data.frame(
  species = factor(tree$tip.label, levels = tree$tip.label),
  Y,
  check.names = FALSE
)
head(df)
#>   species   trait       value
#> 1     sp1 trait_1 -0.01290262
#> 2     sp1 trait_2 -0.63322824
#> 3     sp1 trait_3 -0.25682149
#> 4     sp2 trait_1  0.37488457
#> 5     sp2 trait_2  0.33140862
#> 6     sp2 trait_3  0.92548573

Fit

The long form is canonical: one row per (species, trait) observation. The wide data-frame form is the same model written from one row per species and one column per trait. Because traits(...) names the response traits on the left-hand side, the right-hand side can use the compact covariance syntax.

# Long format (canonical)
fit_long <- gllvmTMB(
  value ~ 0 + trait +
    phylo_latent(species, d = 1, tree = tree) +
    phylo_unique(species, tree = tree) +
    latent(0 + trait | species, d = 1) +
    unique(0 + trait | species),
  data = df,
  unit = "species",
  cluster = "species"
)

# Wide data-frame format (same model with compact formula syntax)
fit_wide <- gllvmTMB(
  traits(trait_1, trait_2, trait_3) ~ 1 +
    phylo_latent(species, d = 1, tree = tree) +
    phylo_unique(species, tree = tree) +
    latent(1 | species, d = 1) +
    unique(1 | species),
  data = df_wide,
  unit = "species",
  cluster = "species"
)

fit <- fit_long
fit$opt$convergence
#> [1] 0

cat("Long/wide logLik difference:",
    signif(abs(as.numeric(logLik(fit_long)) -
               as.numeric(logLik(fit_wide))), 3), "\n")
#> Long/wide logLik difference: 0

The two forms dispatch to the same long-format engine. In this model, phylo_latent(species, d = 1, tree = tree) estimates the shared phylogenetic axis, phylo_unique(species, tree = tree) estimates the trait-specific phylogenetic diagonal, latent(1 | species, d = 1) in the wide formula (expanding to latent(0 + trait | species, d = 1) in the long form) estimates the shared non-phylogenetic axis, and unique(1 | species) (expanding to unique(0 + trait | species)) estimates the trait-specific non-phylogenetic diagonal.

gllvmTMB_wide() is not the right shortcut for this row-phylogeny example. That matrix entry point treats matrix columns as the trait axis and is most natural for site-by-species response matrices, where the phylogeny belongs to the response columns. Here the phylogeny belongs to the row unit species, so the formula-wide traits(...) path is the readable wide form.

Recover the phylogenetic covariance

extract_Sigma(level = "phy") returns the phylogenetic trait covariance. The shared part is Lambda_phy %*% t(Lambda_phy); the unique part is the diagonal s_phy; the total is their sum.

Sigma_phy_shared <- extract_Sigma(fit, level = "phy", part = "shared")$Sigma
Sigma_phy_unique <- extract_Sigma(fit, level = "phy", part = "unique")$s
Sigma_phy_total <- extract_Sigma(fit, level = "phy", part = "total")$Sigma

round(Sigma_phy_shared, 2)
#>         trait_1 trait_2 trait_3
#> trait_1    1.91    0.42    0.29
#> trait_2    0.42    0.09    0.06
#> trait_3    0.29    0.06    0.05
round(Sigma_phy_unique, 2)
#> trait_1 trait_2 trait_3 
#>    0.00    0.05    0.02
round(Sigma_phy_total, 2)
#>         trait_1 trait_2 trait_3
#> trait_1    1.91    0.42    0.29
#> trait_2    0.42    0.15    0.06
#> trait_3    0.29    0.06    0.06

The non-phylogenetic species-level covariance has the same paired decomposition as the phylogenetic side:

Sigma_non_shared <- extract_Sigma(fit, level = "unit", part = "shared")$Sigma
Sigma_non_unique <- extract_Sigma(fit, level = "unit", part = "unique")$s
Sigma_non_total  <- extract_Sigma(fit, level = "unit", part = "total")$Sigma

round(Sigma_non_shared, 2)
#>         trait_1 trait_2 trait_3
#> trait_1    0.21    0.17    0.09
#> trait_2    0.17    0.13    0.07
#> trait_3    0.09    0.07    0.04
round(Sigma_non_unique, 2)
#> trait_1 trait_2 trait_3 
#>    0.06    0.04    0.07
round(Sigma_non_total, 2)
#>         trait_1 trait_2 trait_3
#> trait_1    0.27    0.17    0.09
#> trait_2    0.17    0.17    0.07
#> trait_3    0.09    0.07    0.11

For a small tree, do not over-interpret either split (between Lambda_phy Lambda_phy^T and S_phy, or between Lambda_non Lambda_non^T and S_non). The more stable biological targets are often the totals Sigma_phy and Sigma_non, especially when the tree is small or the ranks are exploratory.

Communality (within each tier)

Communality c_t^2 is the fraction of a trait’s within-tier variance explained by the shared latent axis. It is defined only when both a latent() and a unique() term are fit at that tier; that is what the four-component decomposition gives us on both the phylogenetic and non-phylogenetic sides.

# Non-phylogenetic communality is one extractor call.
c2_non <- extract_communality(fit, level = "unit")
round(c2_non, 2)
#> trait_1 trait_2 trait_3 
#>    0.78    0.77    0.35

# Phylogenetic communality is the same ratio, computed from the
# Sigma_phy decomposition above.
c2_phy <- diag(Sigma_phy_shared) / diag(Sigma_phy_total)
round(c2_phy, 2)
#> trait_1 trait_2 trait_3 
#>    1.00    0.64    0.71

A value near 1 means the trait’s within-tier variance is dominated by the shared axis (high “integration” with the other traits at that tier). A value near 0 means the trait is mostly idiosyncratic at that tier.

Phylogenetic heritability

Phylogenetic heritability H^2 is a between-tier ratio: the fraction of total trait variance attributable to the phylogenetic component (both shared and unique parts of Sigma_phy) versus the non-phylogenetic component (both shared and unique parts of Sigma_non). It is distinct from the within-tier communalities above.

The convenience extractor returns the full three-way decomposition H^2 + C^2_non + Psi = 1 per trait, where:

H^2 = phylogenetic share = diag(Sigma_phy_total) / V_eta
C^2_non = non-phylogenetic shared share = diag(Lambda_non Lambda_non^T) / V_eta
Psi = non-phylogenetic unique share = s_non / V_eta

with V_eta = diag(Sigma_phy_total) + diag(Lambda_non Lambda_non^T) + s_non.

extract_phylo_signal(fit)
#>     trait        H2     C2_non        Psi     V_eta
#> 1 trait_1 0.8748268 0.09769067 0.02748253 2.1885138
#> 2 trait_2 0.4573238 0.41618821 0.12648803 0.3172061
#> 3 trait_3 0.3644067 0.22477253 0.41082074 0.1739173

C^2_non is structurally zero when the formula omits the non-phylogenetic latent() term, so the three-way split collapses to H^2 + Psi. Adding latent(0 + trait | species, d = K) (long) or latent(1 | species, d = K) (wide), as this article does, is what makes C^2_non a genuine non-phylogenetic shared-axis component rather than a placeholder.

Compare against simpler and saturated baselines

Three nested covariance intents bracket the four-component model:

Independent (phylo_indep + indep): per-trait diagonal only at each tier. No shared axis, no cross-trait covariance. The simplest model that respects the phylogenetic structure.
Two-U (phylo_latent + phylo_unique + latent + unique): the four-component decomposition fit in this article. A shared low-rank axis plus a trait-specific diagonal at each tier.
Dependent (phylo_dep + dep): a full unstructured covariance at each tier. Saturated; no shared/unique split is enforced.

The two-U decomposition sits between independent (too restrictive when traits are correlated) and dependent (too many parameters when the number of traits grows). The cross-check functions refit the data under the independent and dependent baselines and compare each component’s implied total covariance against the two-U fit.

diag_indep <- compare_indep_vs_two_U(fit)
diag_indep$agreement
#>        component       rmse indep_mag rel_disagreement  flag
#> 1 Sigma_phy_diag 0.07416986 1.0629839       0.06977515 FALSE
#> 2 Sigma_non_diag 0.07211605 0.1270333       0.56769410  TRUE
diag_indep$flag
#> [1] TRUE

diag_dep <- compare_dep_vs_two_U(fit)
diag_dep$agreement
#>   component       rmse   dep_mag rel_disagreement  flag
#> 1 Sigma_phy 0.02913488 0.6976876        0.0417592 FALSE
#> 2 Sigma_non 0.11354069 0.1311863        0.8654919  TRUE
diag_dep$flag
#> [1] TRUE

Each $agreement row shows the Frobenius relative disagreement between the two-U fit and the baseline (rmse / baseline_mag). flag = TRUE means relative disagreement exceeds 0.10. A flag is not a failed model; it tells you the data do not strongly support the chosen decomposition at the requested rank.

Use compare_indep_vs_two_U() first when the number of traits is large – it only refits the diagonal and stays cheap. Use compare_dep_vs_two_U() when the number of traits is modest and the saturated refit is affordable – it is the more informative benchmark for off-diagonal cross-trait structure.

When to simplify

The paired phylo_latent() + phylo_unique() + latent() + unique() model is useful when the question is about phylogenetically conserved trait syndromes alongside non-phylogenetic shared-axis structure. Simplify when the data do not identify the splits the model asks for:

fit_phylo_dep <- gllvmTMB(
  value ~ 0 + trait + phylo_dep(0 + trait | species, tree = tree) +
                      dep(0 + trait | species),
  data = df,
  unit = "species",
  cluster = "species"
)

phylo_dep() + dep() estimates full unstructured trait covariances at each tier without asking the data to separate a low-rank shared axis from the diagonal. It is the natural fallback when the two-U diagnostics flag.

What to read next

compare_dep_vs_two_U() and compare_indep_vs_two_U() are the likelihood-based cross-checks for this model.
functional-biogeography shows how phylogenetic terms behave as controls in a larger site–species–trait model.
morphometrics is the simplest non-phylogenetic latent() + unique() worked example.

References: Hadfield and Nakagawa (2010) for the phylogenetic mixed model representation; Meyer and Kirkpatrick (2008) for reduced-rank covariance estimation; Williams et al. (2025) for fast phylogenetic GLMMs in the glmmTMB ecosystem.