After reconciling two datasets with reconcile_data(), use this function
to join them into a single analysis-ready data frame. The reconciliation
mapping table provides the species-level join key, so names that differ
between the two datasets (due to formatting, synonyms, or typos) are
correctly linked.
Arguments
- reconciliation
A reconciliation object (typically from
reconcile_data()).
- data_x
The first data frame (source x in the reconciliation).
- data_y
The second data frame (source y in the reconciliation).
- species_col_x
A length-1 character vector. Species column in
data_x. Auto-detected if NULL.
- species_col_y
A length-1 character vector. Species column in
data_y. Auto-detected if NULL.
- how
A length-1 character vector. Join type:
"inner" (default): keep only species matched in both datasets.
"left": keep all species from data_x.
"full": keep all species from both datasets.
- suffix
A length-2 character vector. Suffixes to disambiguate columns with the same name in both datasets. Default
c("_x", "_y").
- drop_unresolved
Logical. If
TRUE, rows where species_resolved is NA (i.e., species that could not be reconciled) are removed from the final result. Default FALSE (keep all rows, fill unmatched columns with NA). Only relevant for how = "left" or how = "full"; inner joins drop unmatched rows by definition.
Value
A data frame with a species_resolved column as the join
key, plus all columns from both datasets (with suffixes added when
column names collide).
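The suffixing can be illustrated with base merge(), which behaves similarly when both inputs share a non-key column name. This is only a sketch: the toy data frames, the made-up species names, and the mass and nest columns are assumptions, and base merge() stands in for reconcile_merge(), whose internals may differ.

```r
# Toy inputs (hypothetical): both carry a "mass" column besides the key.
x <- data.frame(species_resolved = c("Aus bus", "Aus cus"),
                mass = c(7.0, 6.5))
y <- data.frame(species_resolved = c("Aus bus", "Aus cus"),
                mass = c(7.2, 6.4),
                nest = c("dome", "cup"))

# Base merge() appends the suffixes only to the colliding column;
# the key and the non-colliding "nest" column keep their names.
out <- merge(x, y, by = "species_resolved", suffixes = c("_x", "_y"))
names(out)
#> [1] "species_resolved" "mass_x"           "mass_y"           "nest"
```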
Details
One row per species. reconcile_merge() works best when each dataset
has exactly one row per species. If a species appears in multiple rows
(e.g., sex-specific measurements, repeated populations), the merge
produces all pairwise combinations for that species—the same behaviour
as base merge(). To avoid unexpected row expansion, aggregate to one
row per species before merging, or be aware that the output will contain
more rows than either input.
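The row expansion described above can be reproduced with base merge() on toy data. The data frames, species names, and column names below are made up for illustration; averaging is just one possible way to aggregate to one row per species.

```r
# Toy data (hypothetical): "Aus bus" has two rows in x (sexes) and
# two rows in y (populations); "Aus cus" has one row in each.
x <- data.frame(species = c("Aus bus", "Aus bus", "Aus cus"),
                sex     = c("F", "M", "F"),
                wing    = c(50, 54, 61))
y <- data.frame(species = c("Aus bus", "Aus bus", "Aus cus"),
                site    = c("north", "south", "north"),
                clutch  = c(2, 4, 3))

# 2 x 2 pairwise combinations for "Aus bus" plus 1 row for "Aus cus".
expanded <- merge(x, y, by = "species")
nrow(expanded)
#> [1] 5

# Aggregating each side to one row per species first avoids the expansion.
x1 <- aggregate(wing ~ species, data = x, FUN = mean)
y1 <- aggregate(clutch ~ species, data = y, FUN = mean)
one_per_species <- merge(x1, y1, by = "species")
nrow(one_per_species)
#> [1] 2
```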
Asymmetric datasets. When data_y contains many more species than
data_x (common when merging against a large reference database), use
how = "inner" or how = "left". Inner joins keep only the species
present in both datasets; left joins keep all data_x rows and fill
data_y columns with NA for unmatched species. Use how = "full"
only when you need to retain species unique to either side.
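To see how the three join types differ on asymmetric inputs, here is a small sketch using base merge() as an analogue. The data are invented, and the mapping from how = "inner", "left", and "full" to merge()'s all.x and all arguments is an assumption about equivalent behaviour, not a description of how reconcile_merge() is implemented.

```r
# Toy data (hypothetical): x is a small study set, y a larger reference table.
x <- data.frame(species = c("Aus bus", "Aus cus", "Aus dus"),
                wing    = c(50, 61, 47))
y <- data.frame(species = c("Aus bus", "Aus cus", "Bus aus", "Bus bus", "Bus cus"),
                clutch  = c(2, 3, 4, 2, 5))

inner <- merge(x, y, by = "species")                # species present in both
left  <- merge(x, y, by = "species", all.x = TRUE)  # every x species, NA where unmatched
full  <- merge(x, y, by = "species", all = TRUE)    # species from either side

c(inner = nrow(inner), left = nrow(left), full = nrow(full))
#> inner  left  full 
#>     2     3     6 
```

With a large reference table on the y side, the full join grows with the size of y, which is why the inner and left joins are usually the more practical choices here.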
Recommended workflow for multi-row data. Reconcile using a
species-level summary (one row per species), inspect the mapping with
reconcile_mapping(), then join the mapping back to your full dataset
using the species column as key.
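The last step of that workflow might look like the sketch below. The mapping table is built by hand as a stand-in for the output of reconcile_mapping(); its column names (name_x, species_resolved) are assumptions, so check the real mapping for the actual names before adapting this. The data values and species names are also made up.

```r
# Full multi-row dataset (hypothetical): two measurements per species,
# with one species name misspelled in the source.
full_x <- data.frame(Species1 = c("Aus bus", "Aus bus", "Aus cuss", "Aus cuss"),
                     sex      = c("F", "M", "F", "M"),
                     wing     = c(50, 54, 61, 63))

# Stand-in for the mapping table (assumed column names), one row per species.
mapping <- data.frame(name_x           = c("Aus bus", "Aus cuss"),
                      species_resolved = c("Aus bus", "Aus cus"))

# Because the mapping has one row per species, the join keeps every
# row of full_x and only adds the resolved name.
annotated <- merge(full_x, mapping,
                   by.x = "Species1", by.y = "name_x", all.x = TRUE)
nrow(annotated) == nrow(full_x)
#> [1] TRUE
```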
See also
reconcile_data() to build the reconciliation;
reconcile_apply() when you want aligned data + tree instead of a
single merged data frame.
Other reconciliation functions:
reconcile_apply(),
reconcile_augment(),
reconcile_crosswalk(),
reconcile_data(),
reconcile_diff(),
reconcile_export(),
reconcile_mapping(),
reconcile_multi(),
reconcile_override(),
reconcile_override_batch(),
reconcile_plot(),
reconcile_report(),
reconcile_review(),
reconcile_splits_lumps(),
reconcile_suggest(),
reconcile_summary(),
reconcile_to_trees(),
reconcile_tree(),
reconcile_trees()
Examples
data(avonet_subset)
data(nesttrait_subset)
rec <- reconcile_data(avonet_subset, nesttrait_subset,
                      x_species = "Species1",
                      y_species = "Scientific_name",
                      authority = NULL, quiet = TRUE)
merged <- reconcile_merge(rec, avonet_subset, nesttrait_subset,
                          species_col_x = "Species1",
                          species_col_y = "Scientific_name")
#> ✔ Merged 916 species (inner join)
cat(sprintf("Merged: %d rows, %d cols\n", nrow(merged), ncol(merged)))
#> Merged: 916 rows, 31 cols
head(merged[, c("species_resolved", "Family1", "Common_name")])
#>           species_resolved      Family1              Common_name
#> 1 Acanthagenys rufogularis Meliphagidae Spiny-cheeked Honeyeater
#> 2       Acanthiza apicalis Acanthizidae         Inland Thornbill
#> 3    Acanthiza chrysorrhoa Acanthizidae  Yellow-rumped Thornbill
#> 4        Acanthiza cinerea Acanthizidae           Grey Thornbill
#> 5        Acanthiza ewingii Acanthizidae      Tasmanian Thornbill
#> 6       Acanthiza inornata Acanthizidae        Western Thornbill