Proportion-trait benchmark: signal strength sweep

Tree: ape::rtree(300) · Traits: 4 proportion per scenario · Signal: 0.2 – 1.0 · Methods: mean · BM baseline (Rphylopars) · pigauto · Replicates: 5 · Missingness: 25% MCAR · Commit 794537121b · Report generated 2026-05-30 12:04 · Total wall: 107.1 min

Bottom line. At high signal (1.0), the BM baseline achieves RMSE 0.487 and pigauto achieves 0.507. Strong phylogenetic structure means the baseline captures most of the variation, and the calibrated gate stays near zero.

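At low signal (0.2), baseline RMSE rises to 1.117 and pigauto achieves 1.069. With weak phylogenetic structure there is limited information for any method: pigauto improves on the BM baseline, plausibly by exploiting cross-trait correlations, but plain mean imputation (RMSE 1.023) is still the best of the three in this scenario.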
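To make the "gate stays near zero" remark concrete, here is a minimal sketch of a gated blend between a baseline prediction and a learned correction. The function name `gated_blend`, the convex-combination form, and the example numbers are all assumptions for illustration; the report does not state how pigauto actually combines the BM and GNN outputs.

```r
# Hypothetical gated blend (an assumption, not pigauto's documented internals):
# gate = 0 returns the baseline unchanged, gate = 1 returns the GNN output.
gated_blend <- function(bm_pred, gnn_pred, gate) {
  stopifnot(gate >= 0, gate <= 1)
  (1 - gate) * bm_pred + gate * gnn_pred
}

bm_pred  <- c(0.20, 0.50, 0.80)  # made-up baseline predictions
gnn_pred <- c(0.30, 0.40, 0.90)  # made-up GNN predictions

# With a gate near zero the blended output stays close to the baseline.
near_zero <- gated_blend(bm_pred, gnn_pred, gate = 0.05)
max(abs(near_zero - bm_pred))
```

Under this reading, a gate near zero at high signal would mean the combined model reproduces the baseline almost exactly, which is consistent with the two RMSE values being close at signal = 1.0.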

Primary sweep: performance by phylogenetic signal (25% missingness)

Average across 4 traits and 5 replicates. ★ marks the best method per scenario.

                RMSE (lower is better)          Pearson r (higher is better)
Signal          Mean      BM        pigauto     Mean    BM        pigauto
Signal = 0.2    1.023 ★   1.117     1.069       n/a     0.188 ★   0.102
Signal = 0.4    1.005     1.003 ★   1.025       n/a     0.368 ★   0.231
Signal = 0.6    1.013     0.842 ★   0.882       n/a     0.586 ★   0.502
Signal = 0.8    0.986     0.715 ★   0.742       n/a     0.698 ★   0.660
Signal = 1.0    0.969     0.487 ★   0.507       n/a     0.857 ★   0.841
Figure: RMSE by phylogenetic signal. x-axis: phylogenetic signal (0.2 to 1.0); y-axis: RMSE. Series: mean imputation, BM baseline (Rphylopars), pigauto (BM + GNN). Values as in the table above.
Figure: Pearson r by phylogenetic signal. x-axis: phylogenetic signal (0.2 to 1.0); y-axis: Pearson r. Series: mean imputation, BM baseline (Rphylopars), pigauto (BM + GNN). Values as in the table above.
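The two metrics can be sketched as follows, assuming they are computed on the held-out (masked) cells of each trait and then averaged across traits. The helper name `score_imputation` and the toy data are hypothetical; the benchmark's own scoring code may differ. The sketch also illustrates why a mean-imputation column can lack a Pearson r: a constant prediction has zero variance, so the correlation is undefined.

```r
# Score imputations on the held-out (masked) cells only.
# truth, pred: numeric matrices (species x traits); mask: logical, TRUE = held out.
score_imputation <- function(truth, pred, mask) {
  per_trait <- sapply(seq_len(ncol(truth)), function(j) {
    idx <- mask[, j]
    t_j <- truth[idx, j]
    p_j <- pred[idx, j]
    rmse <- sqrt(mean((t_j - p_j)^2))
    # Pearson r is undefined when either vector is constant.
    defined <- length(unique(p_j)) > 1 && length(unique(t_j)) > 1
    r <- if (defined) cor(t_j, p_j) else NA_real_
    c(rmse = rmse, r = r)
  })
  rowMeans(per_trait)  # average over traits; an NA in r propagates
}

set.seed(1)
truth <- matrix(rnorm(300 * 4), nrow = 300, ncol = 4)          # toy data
mask  <- matrix(runif(300 * 4) < 0.25, nrow = 300, ncol = 4)   # toy mask

# Mean imputation: fill each trait with the mean of its observed cells.
obs <- truth
obs[mask] <- NA
mean_pred <- matrix(colMeans(obs, na.rm = TRUE), nrow = 300, ncol = 4, byrow = TRUE)

score_imputation(truth, mean_pred, mask)  # r is NA for a constant predictor
```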

Secondary sweep: boundary density (signal = 0.6)

Proportion of values near 0 or 1. Higher boundary density makes the bounded [0,1] constraint more relevant.

Figure: RMSE by boundary density (signal = 0.6). x-axis: boundary density; y-axis: RMSE. Series: mean imputation, BM baseline (Rphylopars), pigauto (BM + GNN).

Boundary density    Mean      BM        pigauto
Boundary = 0.0      0.980     0.812     0.823
Boundary = 0.1      0.996     0.956     0.998
Boundary = 0.3      0.981     1.089     0.986
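As an illustration of what "proportion of values near 0 or 1" could mean in practice, the sketch below counts values within a cutoff of either bound. The function name `boundary_density` and the cutoff `eps = 0.05` are assumptions; the report does not say how close to the bounds a value must be to count.

```r
# Fraction of values within eps of either bound of [0, 1].
# eps is a hypothetical cutoff; the benchmark's own definition is not given.
boundary_density <- function(x, eps = 0.05) {
  stopifnot(all(x >= 0 & x <= 1, na.rm = TRUE))
  mean(x <= eps | x >= 1 - eps, na.rm = TRUE)
}

x <- c(0.00, 0.02, 0.40, 0.55, 0.97, 1.00, 0.50, 0.30, 0.70, 0.60)
boundary_density(x)  # 4 of 10 values lie near a bound: 0.4
```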

What the benchmark shows

Reproducibility

Driver: script/bench_proportion.R. Tree: ape::rtree(300). Traits: simulate_proportion_traits(). Training: 500 epochs with early stopping. To reproduce: Rscript script/bench_proportion.R, then Rscript script/make_bench_proportion_html.R.
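For readers who want to see the missingness step in isolation, here is a minimal base-R sketch of 25% MCAR masking on a 300 x 4 trait matrix. The helper `mask_mcar` is hypothetical, and masking an exact fraction of cells (rather than an independent draw per cell) is an assumption; the random matrix stands in for the output of simulate_proportion_traits(), whose return format is not described here.

```r
# Mask a fixed fraction of cells completely at random (MCAR).
# Hypothetical helper; the driver script may implement masking differently.
mask_mcar <- function(x, frac = 0.25, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n_mask <- round(frac * length(x))
  mask <- matrix(FALSE, nrow = nrow(x), ncol = ncol(x))
  mask[sample(length(x), n_mask)] <- TRUE  # choose cells uniformly at random
  x_obs <- x
  x_obs[mask] <- NA                        # held-out cells become missing
  list(observed = x_obs, mask = mask)
}

traits <- matrix(runif(300 * 4), nrow = 300, ncol = 4)  # stand-in for simulated traits
out <- mask_mcar(traits, frac = 0.25, seed = 42)
mean(out$mask)  # fraction of cells held out
```

The returned mask records which cells were held out, so predictions can later be scored against the true values at exactly those positions.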