Missingness mechanism benchmark: MCAR vs MAR vs MNAR
Tree: ape::rtree(300)
Traits: mixed-type per scenario
Mechanisms: MCAR · MAR (trait) · MAR (phylo) · MNAR
Methods: mean / mode · phylo baseline · pigauto
Replicates: 5
Missingness: 25%
Commit: d2663bdb21
Run on: 2026-05-11 10:54
Total wall time: 20.7 min
Bottom line. Under MNAR, the phylogenetic baseline's continuous-trait RMSE is 7.4% higher than under MCAR (0.594 vs 0.553). Non-random missingness violates the assumptions underlying phylogenetic imputation, yet the baseline still sits well below mean / mode imputation (0.594 vs 1.051), so the phylogenetic signal remains informative.
Discrete accuracy under MCAR: baseline 72.3%, pigauto 73.8%. Under MNAR: baseline 75.6%, pigauto 75.6%. In this run the discrete rows mostly show pigauto matching the phylogenetic baseline rather than adding a separate advantage.
Primary sweep: metrics by missingness mechanism (25% missingness)
Averaged across traits and 5 replicates. Continuous traits report RMSE (lower is better); discrete traits report accuracy (higher is better). ★ marks the best method per scenario and metric.
| Mechanism | Metric | Mean / mode | Phylo baseline | pigauto |
|---|---|---|---|---|
| MCAR | Avg RMSE (continuous) | 1.002 | 0.553 ★ | 0.556 |
| MCAR | Avg accuracy (discrete) | 0.476 | 0.723 | 0.738 ★ |
| MAR (trait-driven) | Avg RMSE (continuous) | 1.038 | 0.596 | 0.593 ★ |
| MAR (trait-driven) | Avg accuracy (discrete) | 0.554 | 0.764 ★ | 0.752 |
| MAR (phylo-clade) | Avg RMSE (continuous) | 1.039 | 0.588 ★ | 0.594 |
| MAR (phylo-clade) | Avg accuracy (discrete) | 0.557 | 0.753 ★ | 0.714 |
| MNAR | Avg RMSE (continuous) | 1.051 | 0.594 ★ | 0.595 |
| MNAR | Avg accuracy (discrete) | 0.508 | 0.756 ★ | 0.756 |
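The relative changes quoted in the bottom line can be rechecked from the table. The following is a small Python sketch, not part of the benchmark driver; it copies only the phylogenetic baseline's continuous-trait RMSE values from the table, and the helper name `relative_change_pct` is introduced here for illustration.

```python
# Continuous-trait RMSE of the phylogenetic baseline, copied from the table.
baseline_rmse = {
    "MCAR": 0.553,
    "MAR (trait-driven)": 0.596,
    "MAR (phylo-clade)": 0.588,
    "MNAR": 0.594,
}


def relative_change_pct(value, reference):
    """Percentage change of `value` relative to `reference`."""
    return 100.0 * (value - reference) / reference


# MCAR is the comparator for every other mechanism.
reference = baseline_rmse["MCAR"]
changes = {
    mechanism: round(relative_change_pct(rmse, reference), 1)
    for mechanism, rmse in baseline_rmse.items()
}

for mechanism, pct in changes.items():
    print(f"{mechanism:20s} {pct:+.1f}%")
```

Because the inputs are table values rounded to three decimals, the percentages are only as precise as that rounding allows.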
Secondary sweep: MAR severity
Continuous-trait RMSE as the MAR dependency strength (β) increases. Higher β means the probability of missingness depends more strongly on an observed covariate.
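To make the role of β concrete, here is a minimal Python sketch of one way a MAR mechanism with tunable strength could be simulated. It assumes a logistic link, P(missing) = logistic(α + β·x), with α tuned so the average missingness stays at 25%; the link function, the helper names `logistic` and `mar_missing_probs`, and the standard-normal covariate are all assumptions of this sketch, not details taken from the benchmark driver.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def mar_missing_probs(covariate, beta, target_rate=0.25):
    """Per-row missingness probabilities logistic(alpha + beta * x).

    alpha is found by bisection so that the mean probability equals
    target_rate. The logistic link is an assumption of this sketch.
    """
    def mean_prob(alpha):
        return sum(logistic(alpha + beta * x) for x in covariate) / len(covariate)

    lo, hi = -20.0, 20.0  # mean_prob is increasing in alpha
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if mean_prob(mid) < target_rate:
            lo = mid
        else:
            hi = mid
    alpha = (lo + hi) / 2.0
    return [logistic(alpha + beta * x) for x in covariate]


rng = random.Random(0)
covariate = [rng.gauss(0.0, 1.0) for _ in range(300)]  # one observed value per tip

for beta in (0.0, 1.0, 2.0):
    probs = mar_missing_probs(covariate, beta)
    mask = [rng.random() < p for p in probs]  # True = value set to missing
    spread = max(probs) - min(probs)
    print(f"beta={beta}: mean p={sum(probs) / len(probs):.3f}, "
          f"spread={spread:.3f}, n_missing={sum(mask)}")
```

With β = 0 every row has the same missingness probability, which reduces to MCAR; as β grows, the probabilities spread out and gaps concentrate in rows with large covariate values while the overall rate stays fixed.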
What the benchmark shows
MCAR is the reference setting. When missingness is completely at random, the missingness mechanism itself does not add value-dependent bias; use this row as the comparator for the MAR and MNAR scenarios.
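MAR (trait-driven) introduces moderate difficulty. Missingness depends on an observed trait, so the gaps are non-random but still predictable from observed data. Baseline RMSE rises from 0.553 under MCAR to 0.596 (+7.8%), yet the phylogenetic baseline stays well ahead of mean / mode imputation (1.038), plausibly because the phylogenetic signal carries information that the trait-driven missingness pattern does not remove.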
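MAR (phylo-clade) clusters gaps in the tree. When entire clades are missing, the phylogenetic baseline loses its closest informants. The resulting deltas are mixed by trait and metric in this run: averaged over traits, baseline RMSE is worse than under MCAR (0.588 vs 0.553) while baseline discrete accuracy is higher (0.753 vs 0.723), and pigauto trails the baseline on both metrics (0.594 RMSE, 0.714 accuracy).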
MNAR is statistically harder in principle. When missingness depends on the unobserved value itself, all imputation methods can be biased. The table shows how that principle played out in this particular simulation rather than a universal ranking.
pigauto mostly tracks the phylogenetic baseline across mechanisms. The calibrated gate often closes when the baseline is already strong; any GNN contribution should be read from the trait-level rows rather than assumed.
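The contrast between MCAR and MNAR described above can be illustrated with a toy simulation. This is a hedged Python sketch on a single standard-normal trait with no phylogeny; the 45% / 5% value-dependent missingness rates are hypothetical numbers chosen only so the overall rate is about 25%, and they are not the mechanism used by the benchmark.

```python
import random

rng = random.Random(1)
n = 20000
values = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true trait values


def mean(xs):
    return sum(xs) / len(xs)


# MCAR: every value has the same 25% chance of being missing.
mcar_missing = [rng.random() < 0.25 for _ in values]

# MNAR: the chance of being missing depends on the value itself.
# The 45% / 5% split is a hypothetical choice that averages to about 25%.
mnar_missing = [rng.random() < (0.45 if v > 0 else 0.05) for v in values]

obs_mcar = [v for v, m in zip(values, mcar_missing) if not m]
obs_mnar = [v for v, m in zip(values, mnar_missing) if not m]

print("true mean          ", round(mean(values), 3))
print("observed mean, MCAR", round(mean(obs_mcar), 3))
print("observed mean, MNAR", round(mean(obs_mnar), 3))
print("MNAR missing rate  ", round(mean(mnar_missing), 3))
```

Under MCAR the observed values are a fair sample, so their mean stays close to the true mean. Under this MNAR rule large values are preferentially removed, so any method fitted to the observed values alone starts from a downward-shifted picture of the trait, which is the sense in which MNAR can bias every imputation method.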
Reproducibility
Driver: script/bench_missingness_mechanism.R. Tree: ape::rtree(300). Training: 500 epochs with early stopping. To reproduce: Rscript script/bench_missingness_mechanism.R, then Rscript script/make_bench_missingness_mechanism_html.R.