Evaluate imputation performance against known values — evaluate

Computes type-specific metrics for each trait on the validation and test splits. When a trait_map is supplied, metrics are dispatched per trait type; otherwise the function falls back to continuous-only metrics (RMSE, Pearson r, 95% coverage).

Usage

evaluate_imputation(pred, truth, splits, pred_se = NULL, trait_map = NULL)

Arguments

pred: predicted values, given either as a numeric matrix on the latent scale (same dimensions as truth) or as a "pigauto_pred" object from predict.pigauto_fit.
truth: numeric matrix of true values on the latent scale (from pigauto_data$X_scaled).
splits: list (output of make_missing_splits).
pred_se: numeric matrix of prediction SEs (same scale as pred). Used for the 95% coverage metric. When pred is a pigauto_pred, pred$se is used instead.
trait_map: list of trait descriptors (from pigauto_data). If NULL and pred is not a pigauto_pred, the v0.1 all-continuous evaluation is used.

Value

A data.frame with columns split, trait, type, n, and type-specific metric columns.
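
Because the result is an ordinary data.frame, it can be filtered with base R. The sketch below uses a toy stand-in rather than real output: the split labels ("val", "test"), the trait names, the numbers, and the metric column name rmse are all assumptions made for illustration. Only the columns split, trait, type, and n are documented above.

```r
# Toy stand-in for the returned data.frame. Values, split labels, and the
# `rmse` column name are illustrative assumptions, not real pigauto output.
eval_df <- data.frame(
  split = c("val", "val", "test", "test"),
  trait = c("mass", "diet", "mass", "diet"),
  type  = c("continuous", "categorical", "continuous", "categorical"),
  n     = c(12L, 10L, 15L, 11L),
  rmse  = c(0.41, NA, 0.47, NA)
)

# Keep only continuous traits evaluated on the test split.
test_cont <- subset(eval_df, split == "test" & type == "continuous")
test_cont
```

Metric columns that do not apply to a trait type are shown here as NA; whether the real function fills them the same way is not stated on this page.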

Details

Metrics by trait type:

continuous: RMSE, Pearson r, 95% coverage (when SEs are supplied)
proportion: RMSE, Pearson r, 95% coverage (when SEs are supplied)
count: RMSE, MAE, Pearson r
ordinal: RMSE, Spearman rho
binary: Accuracy, Brier score
categorical: Accuracy
zi_count: RMSE, MAE, Pearson r, zero-accuracy, Brier score (on gate)
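
As a rough illustration of the continuous metrics listed above, the following base R sketch computes RMSE, Pearson r, and 95% coverage for one trait. The helper continuous_metrics is hypothetical and not part of pigauto. The coverage definition used here (share of true values falling inside pred ± 1.96 × SE) is an assumption about what "95% coverage" means, not a statement of the package's implementation.

```r
# Hypothetical helper, not part of pigauto: continuous-trait metrics
# computed on cells where both prediction and truth are available.
continuous_metrics <- function(pred, truth, se = NULL, z = 1.96) {
  ok  <- !is.na(pred) & !is.na(truth)
  err <- pred[ok] - truth[ok]
  out <- c(
    rmse       = sqrt(mean(err^2)),
    pearson_r  = cor(pred[ok], truth[ok]),
    coverage95 = NA_real_
  )
  if (!is.null(se)) {
    # Assumed definition: share of true values inside pred +/- z * se.
    out["coverage95"] <- mean(abs(err) <= z * se[ok])
  }
  out
}

# Toy inputs for a single trait.
truth <- c(0.0, 1.0, 2.0, 3.0)
pred  <- c(0.5, 1.0, 2.0, 2.5)
se    <- c(0.1, 0.5, 0.5, 0.5)

m <- continuous_metrics(pred, truth, se)
m
```

For ordinal traits, the same pattern would apply with cor(..., method = "spearman") in place of the Pearson correlation.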

For binary and categorical traits the function accepts either a pigauto_pred object (preferred, gives access to probabilities) or raw matrices (latent scale).
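
The Brier score is defined on predicted probabilities, which is presumably why a pigauto_pred object is the preferred input for these trait types. A minimal sketch of the two binary metrics follows, assuming a vector of predicted probabilities and 0/1 truth. The helper binary_metrics and the 0.5 classification threshold are assumptions for illustration and do not describe how pigauto derives classes from its predictions.

```r
# Hypothetical helper, not part of pigauto: accuracy and Brier score
# for a binary trait, given predicted probabilities of class 1.
binary_metrics <- function(prob, truth, threshold = 0.5) {
  ok <- !is.na(prob) & !is.na(truth)
  p  <- prob[ok]
  y  <- truth[ok]
  c(
    # Share of cells where the thresholded prediction matches the truth.
    accuracy = mean((p >= threshold) == (y == 1)),
    # Mean squared difference between probability and 0/1 outcome.
    brier    = mean((p - y)^2)
  )
}

# Toy inputs: two confident correct calls and two wrong ones.
prob  <- c(0.9, 0.2, 0.6, 0.4)
truth <- c(1, 0, 0, 1)

b <- binary_metrics(prob, truth)
b
```

A lower Brier score indicates better-calibrated probabilities; accuracy alone would not distinguish a prediction of 0.51 from one of 0.99.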

Examples

if (FALSE) { # \dontrun{
# From pigauto_pred object
eval_df <- evaluate_imputation(pred_obj, pd$X_scaled, splits)

# From raw latent matrix
eval_df <- evaluate_imputation(bl$mu, pd$X_scaled, splits,
                                trait_map = pd$trait_map)
} # }