Run a simulation benchmark for pigauto — simulate

Generates trait data under various evolutionary models, introduces missing data, fits both the Brownian motion baseline and the full pigauto GNN, and compares performance. This is the recommended way to assess pigauto on data with known properties before applying it to real data.

Usage

simulate_benchmark(
  n_species = 100L,
  n_traits = 4L,
  scenarios = c("BM", "OU", "regime_shift", "nonlinear", "mixed"),
  missing_frac = 0.25,
  n_reps = 3L,
  epochs = 500L,
  verbose = TRUE,
  ...
)

Arguments

n_species: integer. Number of tips in the simulated tree (default 100).
n_traits: integer. Number of continuous traits (default 4). Ignored for the "mixed" scenario, which always generates a fixed trait set (2 continuous, 1 binary, 1 categorical).
scenarios: character vector. Subset of c("BM", "OU", "regime_shift", "nonlinear", "mixed"). Default runs all.
missing_frac: numeric. Fraction of observed cells held out for evaluation (default 0.25).
n_reps: integer. Number of replicate trees per scenario (default 3).
epochs: integer. Maximum GNN training epochs (default 500).
verbose: logical. Print progress (default TRUE).
...: additional arguments passed to fit_pigauto.
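
The role of missing_frac can be illustrated with a minimal base-R sketch. This is not pigauto's internal masking code: the trait matrix, the uniform per-cell sampling, and all variable names below are assumptions made for illustration.

```r
# Hypothetical illustration of holding out a fraction of observed cells.
# Not taken from pigauto; names and the sampling scheme are assumptions.
set.seed(1)
traits <- matrix(
  rnorm(100 * 4), nrow = 100, ncol = 4,
  dimnames = list(paste0("sp", 1:100), paste0("trait", 1:4))
)
missing_frac <- 0.25

observed  <- which(!is.na(traits))                  # linear indices of observed cells
n_holdout <- round(missing_frac * length(observed)) # number of cells to hide
holdout   <- sample(observed, n_holdout)            # uniform draw over cells

masked <- traits
masked[holdout] <- NA        # what an imputation method would see
truth  <- traits[holdout]    # kept aside to score the imputed values
```

Because the true values in truth are known, any imputation method can be scored on exactly the cells it never saw.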

Value

An object of class "pigauto_benchmark" with:

results: data.frame with columns: scenario, rep, method, trait, type, metric, value, n_test.
summary: data.frame averaged across replicates.
scenarios: character vector of scenarios run.
n_reps: integer. Number of replicates run per scenario.
n_species: integer. Number of tips in each simulated tree.
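
As a rough sketch of how results relates to summary, the long-format table can be averaged over the rep column. The column names come from the description above; the method labels, the metric name, and the values are made up for illustration, and the package may compute its summary differently.

```r
# Mock results table with the documented columns.
# Method labels, metric name, and values are hypothetical.
set.seed(42)
results <- expand.grid(
  scenario = c("BM", "OU"),
  rep      = 1:3,
  method   = c("baseline", "gnn"),
  trait    = c("trait1", "trait2"),
  stringsAsFactors = FALSE
)
results$type   <- "continuous"
results$metric <- "rmse"
results$value  <- runif(nrow(results), min = 0.2, max = 1)
results$n_test <- 25L

# Average the metric across replicates within each remaining grouping.
summary_df <- aggregate(
  value ~ scenario + method + trait + type + metric,
  data = results, FUN = mean
)
```

Each row of summary_df then holds one scenario, method, and trait combination, which makes it straightforward to compare the two methods side by side.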

Details

Available scenarios:

"BM": Pure Brownian motion – the baseline is exact, so the GNN should tie or slightly improve via inter-trait correlations.
"OU": Ornstein-Uhlenbeck – stabilising selection constrains variation. BM over-estimates evolutionary variance.
"regime_shift": Two-regime BM – clade-specific optima create bimodal distributions that BM cannot capture.
"nonlinear": Non-linear inter-trait relationships – the GNN's multi-layer message passing can capture quadratic and interaction effects that BM's linear covariance misses.
"mixed": Mixed trait types: 2 continuous + 1 binary + 1 categorical (3 levels). Tests the full type pipeline.

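The contrast between the "BM" and "OU" scenarios can be seen in a small base-R simulation. This sketch evolves independent lineages (effectively a star phylogeny) with a simple Euler discretisation; it is not how pigauto simulates traits on a tree, and the function simulate_lineages and its parameters are hypothetical.

```r
# Hypothetical helper: evolve independent lineages under BM (alpha = 0)
# or OU (alpha > 0) using an Euler step. Not part of pigauto.
simulate_lineages <- function(n_lineages, n_steps, dt, sigma,
                              alpha = 0, theta = 0) {
  x <- numeric(n_lineages)                  # all lineages start at 0
  for (i in seq_len(n_steps)) {
    pull  <- alpha * (theta - x) * dt       # pull towards theta; zero under BM
    noise <- sigma * sqrt(dt) * rnorm(n_lineages)
    x <- x + pull + noise
  }
  x
}

set.seed(123)
bm <- simulate_lineages(2000, n_steps = 200, dt = 0.05, sigma = 1)
ou <- simulate_lineages(2000, n_steps = 200, dt = 0.05, sigma = 1, alpha = 1)

var(bm)  # theory: sigma^2 * total time = 10
var(ou)  # theory (continuous time): sigma^2 / (2 * alpha) = 0.5
```

Under BM the variance keeps growing with time, whereas under OU it levels off, which is why a BM model fitted to OU data tends to over-estimate evolutionary variance.
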
Examples

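if (FALSE) { # not run
# Smaller and faster than the defaults: fewer species, epochs, and replicates
bench <- simulate_benchmark(n_species = 50L, epochs = 200L, n_reps = 2L)
bench$summary  # metrics averaged across replicates
plot(bench)
}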