Formula-LHS marker that lets gllvmTMB() accept a wide data frame
(one row per individual, one column per trait) instead of the
canonical long-format (unit, trait) data.
Value
A formula marker; never evaluated as a function call. The
parser recognises traits(...) on the LHS of a gllvmTMB()
formula and dispatches to the wide-format pivot pre-pass.
Details
The package accepts data in two shapes, long or wide, through three entry points:
long: gllvmTMB(value ~ ..., data = df_long, ...), with one row per (unit, trait) observation.
wide data frame: gllvmTMB(traits(t1, t2, ...) ~ ..., data = df_wide, ...), with one row per unit, one column per trait, and compact formula syntax.
wide matrix: gllvmTMB_wide(Y, ...), a numeric-matrix or data-frame wrapper for matrix-first workflows.
All paths reach the same long-format engine; the user picks whichever shape matches their data on disk.
Because the LHS already names the response traits, the RHS can use a
compact wide shorthand. 1 expands to the trait-specific intercepts
0 + trait; ordinary predictors such as env_temp expand to
(0 + trait):env_temp; and latent(1 | individual) /
unique(1 | individual) expand to the long covariance syntax
latent(0 + trait | individual) / unique(0 + trait | individual).
The same 1 | group shorthand is recognised for indep(),
dep(), bar-style phylo_indep() / phylo_dep(), and the
spatial_*() keywords. Species-axis phylogenetic keywords such as
phylo_latent(species, d = K) already name their phylogenetic axis
and pass through unchanged. Ordinary random-intercept terms such as
(1 | batch) also pass through unchanged.

gllvmTMB(
  traits(sleep, mass, lifespan, brain) ~ 1 + env_temp +
    latent(1 | individual, d = 2) +
    unique(1 | individual),
  data = wide_df,
  unit = "individual",
  family = gaussian()
)

Internally, traits() is implemented as a tidyr::pivot_longer()
pre-pass: the wide data is pivoted to long format with trait as a
factor column (levels in the order the user supplied to traits())
and .y_wide_ as the response column; the LHS of the formula is
rewritten from traits(...) to .y_wide_; and the compact RHS is
expanded to the trait-stacked long syntax before dispatch. The
explicit long RHS remains accepted, so existing calls that already
write 0 + trait and latent(0 + trait | group) keep working.
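
The pre-pass described above can be approximated by hand. The sketch below is an illustration only, not the package's internal code: wide_df, trait_cols, and long_formula are hypothetical names, and the long formula is simply what the expansion rules stated in this section imply for a two-trait example. Building the formula does not evaluate latent() or unique(), so the block runs with tidyr alone.

```r
library(tidyr)

# Hypothetical wide data: one row per individual, one column per trait.
wide_df <- data.frame(
  individual = c("i1", "i2", "i3"),
  env_temp   = c(10.5, 12.0, 9.8),
  sleep      = c(8.1, 7.4, 9.0),
  mass       = c(2.3, 2.9, 1.8)
)

# Trait columns in the order they would be supplied to traits().
trait_cols <- c("sleep", "mass")

# Pivot to long format with `.y_wide_` as the response column.
long_df <- pivot_longer(
  wide_df,
  cols           = all_of(trait_cols),
  names_to       = "trait",
  values_to      = ".y_wide_",
  values_drop_na = TRUE
)

# `trait` becomes a factor whose levels follow the supplied order.
long_df$trait <- factor(long_df$trait, levels = trait_cols)

# Long syntax implied by the documented expansion rules for the compact call
#   traits(sleep, mass) ~ 1 + env_temp +
#     latent(1 | individual, d = 2) + unique(1 | individual)
long_formula <- .y_wide_ ~ 0 + trait + (0 + trait):env_temp +
  latent(0 + trait | individual, d = 2) +
  unique(0 + trait | individual)
```

Passing long_df and long_formula to gllvmTMB() would then correspond to the explicit long-format call that the compact wide call stands for.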
Tidyselect verbs are supported because traits() forwards its
arguments to tidyr::pivot_longer(cols = ...):
traits(all_of(cols)), traits(starts_with("sp")),
traits(matches("^y[0-9]+$")), traits(any_of(c("a", "b"))), and
bare names all work.
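
To see why the different selection styles are interchangeable, the following sketch calls tidyr::pivot_longer() directly with each of them. pivot_traits() is a hypothetical helper that mimics only the column-selection step, and the sp1 to sp3 columns are made-up data.

```r
library(tidyr)

# Hypothetical wide data with three species columns.
wide_df <- data.frame(
  site = c("a", "b"),
  sp1  = c(3, 0),
  sp2  = c(1, 4),
  sp3  = c(0, 2)
)

# Hypothetical helper: forwards a tidyselect expression to `cols`.
pivot_traits <- function(data, cols) {
  pivot_longer(
    data,
    cols      = {{ cols }},
    names_to  = "trait",
    values_to = ".y_wide_"
  )
}

sp_cols <- c("sp1", "sp2", "sp3")

by_name   <- pivot_traits(wide_df, c(sp1, sp2, sp3))       # bare names
by_all_of <- pivot_traits(wide_df, all_of(sp_cols))        # character vector
by_prefix <- pivot_traits(wide_df, starts_with("sp"))      # name prefix
by_regex  <- pivot_traits(wide_df, matches("^sp[0-9]+$"))  # regular expression
```

All four results contain the same six (site, trait) rows, because each expression resolves to the same three columns.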
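Cells with NA responses are dropped via
pivot_longer(values_drop_na = TRUE), which is the canonical default:
a unit with a missing trait still contributes its observed traits.
Users who want strict listwise deletion (dropping every unit that has
any missing trait) should pre-filter the wide data before calling
gllvmTMB().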
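
The difference between the two missing-data policies can be sketched with tidyr alone. The data and object names below are hypothetical; tidyr::drop_na() is one way to do the pre-filtering, and stats::complete.cases() would serve equally well.

```r
library(tidyr)

# Hypothetical wide data: individual i2 is missing its sleep value.
wide_df <- data.frame(
  individual = c("i1", "i2", "i3"),
  sleep      = c(8.1, NA, 9.0),
  mass       = c(2.3, 2.9, 1.8)
)

# Cell-wise drop: only the missing (i2, sleep) cell is removed,
# so i2 still contributes its mass observation.
cellwise <- pivot_longer(
  wide_df,
  cols           = c(sleep, mass),
  names_to       = "trait",
  values_to      = ".y_wide_",
  values_drop_na = TRUE
)

# Listwise drop: remove every individual with any missing trait first,
# then pivot the remaining complete rows.
complete_df <- drop_na(wide_df, sleep, mass)
listwise <- pivot_longer(
  complete_df,
  cols      = c(sleep, mass),
  names_to  = "trait",
  values_to = ".y_wide_"
)
```

Here cellwise keeps five rows, including the mass observation for i2, while listwise keeps four rows and no trace of i2.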
Mixed-family fits (family = list(...) keyed by trait) flow through
the long-format engine; traits() does not intercept the family
argument. Per-row weight vectors of length nrow(data) are
replicated across traits automatically, then passed to the same
long-format weight path used by gllvmTMB(). For per-cell weight
matrices use the matrix-in entry point gllvmTMB_wide().
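
What replicating a per-row weight vector across traits means can be shown by carrying the weights through a pivot. This is a sketch of the idea, not the package's own weight path; the row_weight column name and the data are hypothetical.

```r
library(tidyr)

# Hypothetical wide data with two traits.
wide_df <- data.frame(
  individual = c("i1", "i2", "i3"),
  sleep      = c(8.1, 7.4, 9.0),
  mass       = c(2.3, 2.9, 1.8)
)

# One weight per row of the wide data (length nrow(wide_df)).
w <- c(1, 0.5, 2)

# Attach the weights as an ordinary column so that the pivot
# repeats each weight once per trait.
wide_df$row_weight <- w

long_df <- pivot_longer(
  wide_df,
  cols      = c(sleep, mass),
  names_to  = "trait",
  values_to = ".y_wide_"
)

# Each unit's weight now appears once for every trait it contributes:
# 1, 1, 0.5, 0.5, 2, 2
long_weights <- long_df$row_weight
```

With no missing cells this is the same as rep(w, each = 2). A per-cell weight matrix cannot be expressed this way, which is why that case goes through gllvmTMB_wide().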
See also
gllvmTMB() for the long-format engine, gllvmTMB_wide()
for the matrix-in API (use that when you have per-cell weight
matrices or come from a gllvm-style workflow). The source-tree
contract is docs/design/02-data-shape-and-weights.md.
