Convert a published taxonomy crosswalk into an overrides table
Source: R/reconcile_crosswalk.R
reconcile_crosswalk.Rd
Turn a curated species-name crosswalk (e.g. the BirdLife–BirdTree
crosswalk bundled as crosswalk_birdlife_birdtree, or Clements
updates released each year) into a data frame that can be passed
straight to the overrides argument of reconcile_tree(),
reconcile_data() and friends.
Usage
reconcile_crosswalk(
  crosswalk,
  from_col,
  to_col,
  match_type_col = NULL,
  notes_col = NULL,
  one_to_one_only = FALSE
)
Arguments
- crosswalk
A data frame, or a file path. File format is inferred from the extension: .csv (comma-separated), .tsv (tab-separated), or .txt (tab-separated). For other delimited formats, read the file yourself with read.delim() or read.table() and pass the resulting data frame.
- from_col
A length-1 character vector. Column name for source names (e.g., "Species1" for BirdLife names).
- to_col
A length-1 character vector. Column name for target names (e.g., "Species3" for BirdTree names).
- match_type_col
A length-1 character vector or NULL. Name of an optional column in crosswalk that classifies each row's relationship between the two taxonomies, e.g. "1BL to 1BT" (one BirdLife species mapped to one BirdTree species; a clean one-to-one match), "Many BL to 1BT" (a lump: several BirdLife species mapped to a single BirdTree species), or "1BL to many BT" (a split). When supplied, the contents of this column are appended to each override's user_note so the audit trail records the relationship; if you also pass one_to_one_only = TRUE, only the rows whose match type has the form "1... to 1..." are kept. Pass NULL (default) when your crosswalk has no such classification column; every row is then kept and notes carry no provenance label.
- notes_col
A length-1 character vector or NULL. Name of an optional column containing additional notes.
- one_to_one_only
Logical. If TRUE, keeps only one-to-one matches (e.g., "1BL to 1BT"). Default FALSE.
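For a crosswalk stored in a delimited format other than .csv, .tsv or .txt, the crosswalk argument description suggests reading the file yourself. A minimal base-R sketch of that step, assuming a hypothetical semicolon-delimited file (the file and its rows are made up for illustration; only the column names follow the bundled example):

```r
# Hypothetical semicolon-delimited crosswalk written to a temporary file.
# The rows are illustrative only.
path <- tempfile(fileext = ".dat")
writeLines(
  c(
    "Species1;Species3;Match.type",
    "Acanthidops bairdi;Acanthidops bairdii;1BL to 1BT",
    "Acanthis flammea;Carduelis flammea;1BL to many BT"
  ),
  path
)

# read.delim() defaults to tab-separated input, so set sep explicitly.
cw <- read.delim(path, sep = ";", stringsAsFactors = FALSE)
str(cw)

# The resulting data frame can then be passed in place of a file path, e.g.:
# overrides <- reconcile_crosswalk(cw, from_col = "Species1", to_col = "Species3")
```

Reading the file yourself also lets you check column names and encoding before the crosswalk is converted.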
Value
A data frame with columns name_x, name_y, and
user_note, ready to be passed as the overrides argument.
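To make the shape of the returned table concrete, the following base-R sketch builds a similar three-column data frame by hand from a toy crosswalk. It is not the package's implementation: the regular expression used for the one-to-one filter and the last (identical-pair) row are assumptions made for illustration, and the note format is copied from the example output on this page.

```r
# Toy crosswalk. The first three rows mirror the example output on this page;
# the last row is a made-up identical pair.
cw <- data.frame(
  Species1 = c("Acanthidops bairdi", "Acanthis flammea",
               "Acanthis flammea", "Examplus identicus"),
  Species3 = c("Acanthidops bairdii", "Carduelis flammea",
               "Carduelis hornemanni", "Examplus identicus"),
  Match.type = c("1BL to 1BT", "1BL to many BT",
                 "1BL to many BT", "1BL to 1BT"),
  stringsAsFactors = FALSE
)

# Identical pairs need no override, so they are dropped.
changed <- cw[cw$Species1 != cw$Species3, ]

# Columns named as described under Value; the match type is appended to the note.
sketch <- data.frame(
  name_x = changed$Species1,
  name_y = changed$Species3,
  user_note = paste0("crosswalk [", changed$Match.type, "]"),
  stringsAsFactors = FALSE
)

# Assumed pattern for "1... to 1..." match types (one source, one target).
one_to_one <- grepl("^1[^ ]* to 1", changed$Match.type)
sketch_strict <- sketch[one_to_one, ]

sketch
sketch_strict
```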
Details
Using a crosswalk is preferable to automated synonym resolution when an authoritative mapping exists — it is reproducible, does not depend on taxadb being available, and you can point to the published source in the methods section of your paper.
See also
reconcile_override_batch() for applying this table
directly to an existing reconciliation; crosswalk_birdlife_birdtree
for the bundled example.
Other reconciliation functions:
reconcile_apply(),
reconcile_augment(),
reconcile_data(),
reconcile_diff(),
reconcile_export(),
reconcile_mapping(),
reconcile_merge(),
reconcile_multi(),
reconcile_override(),
reconcile_override_batch(),
reconcile_plot(),
reconcile_report(),
reconcile_review(),
reconcile_splits_lumps(),
reconcile_suggest(),
reconcile_summary(),
reconcile_to_trees(),
reconcile_tree(),
reconcile_trees()
Examples
data(crosswalk_birdlife_birdtree)
overrides <- reconcile_crosswalk(
  crosswalk_birdlife_birdtree,
  from_col = "Species1",
  to_col = "Species3",
  match_type_col = "Match.type"
)
#> ℹ 1933 many-to-one entries (lumps) included
#> ℹ 225 one-to-many entries (splits) included
#> ✔ Crosswalk: 3039 overrides (8079 identical pairs skipped)
head(overrides)
#> name_x name_y user_note
#> 1 Acanthidops bairdi Acanthidops bairdii crosswalk [1BL to 1BT]
#> 2 Acanthis flammea Carduelis flammea crosswalk [1BL to many BT]
#> 3 Acanthis flammea Carduelis hornemanni crosswalk [1BL to many BT]
#> 4 Acanthiza cinerea Gerygone cinerea crosswalk [1BL to 1BT]
#> 5 Acanthoptila nipalensis Turdoides nipalensis crosswalk [1BL to 1BT]
#> 6 Accipiter bicolor Accipiter chilensis crosswalk [1BL to many BT]
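Because splits are included by default, one source name can appear in several rows of the overrides table. The sketch below re-creates the six rows printed above as a plain data frame (so it runs without the package) and lists the source names that map to more than one target. The object names ov, counts and multi are illustrative.

```r
# The six rows shown by head(overrides), typed in by hand.
ov <- data.frame(
  name_x = c("Acanthidops bairdi", "Acanthis flammea", "Acanthis flammea",
             "Acanthiza cinerea", "Acanthoptila nipalensis", "Accipiter bicolor"),
  name_y = c("Acanthidops bairdii", "Carduelis flammea", "Carduelis hornemanni",
             "Gerygone cinerea", "Turdoides nipalensis", "Accipiter chilensis"),
  user_note = c("crosswalk [1BL to 1BT]", "crosswalk [1BL to many BT]",
                "crosswalk [1BL to many BT]", "crosswalk [1BL to 1BT]",
                "crosswalk [1BL to 1BT]", "crosswalk [1BL to many BT]"),
  stringsAsFactors = FALSE
)

# Source names that occur in more than one row within this subset.
counts <- table(ov$name_x)
multi <- names(counts)[counts > 1]

# Rows that belong to those repeated source names.
ov[ov$name_x %in% multi, ]
```

Within these six rows only Acanthis flammea repeats. Accipiter bicolor is also labelled as a split, but its other target rows fall outside the rows shown by head().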