Turn a curated species-name crosswalk (e.g. the BirdLife–BirdTree crosswalk bundled as crosswalk_birdlife_birdtree, or Clements updates released each year) into a data frame that can be passed straight to the overrides argument of reconcile_tree(), reconcile_data() and friends.

Usage

reconcile_crosswalk(
  crosswalk,
  from_col,
  to_col,
  match_type_col = NULL,
  notes_col = NULL,
  one_to_one_only = FALSE
)

Arguments

crosswalk

A data frame, or a file path. File format is inferred from the extension: .csv (comma-separated), .tsv (tab-separated), or .txt (tab-separated). For other delimited formats, read the file yourself with read.delim() or read.table() and pass the resulting data frame.

from_col

A length-1 character vector. Column name for source names (e.g., "Species1" for BirdLife names).

to_col

A length-1 character vector. Column name for target names (e.g., "Species3" for BirdTree names).

match_type_col

A length-1 character vector or NULL. Name of an optional column in crosswalk that classifies each row's relationship between the two taxonomies, e.g. "1BL to 1BT" (one BirdLife species mapped to one BirdTree species; a clean one-to-one match), "Many BL to 1BT" (a lump: several BirdLife species mapped to a single BirdTree species), or "1BL to many BT" (a split: one BirdLife species mapped to several BirdTree species). When supplied, the contents of this column are appended to each override's user_note so the audit trail records the relationship; if you also pass one_to_one_only = TRUE, only rows whose match type has the form "1... to 1..." (such as "1BL to 1BT") are kept. Pass NULL (the default) when your crosswalk has no such classification column; every row is then kept and notes carry no provenance label.

notes_col

A length-1 character vector or NULL. Name of an optional column in crosswalk containing additional notes. Default NULL.

one_to_one_only

Logical. If TRUE, keeps only one-to-one matches (e.g., "1BL to 1BT"). Default FALSE.
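The combined effect of match_type_col and one_to_one_only can be pictured with plain base R on a toy data frame. This is only a rough sketch of the documented behaviour, not the function's internals: the three rows are copied from the bundled example output, the regular expression is an assumption about what a match type of the form "1... to 1..." means, and the note format imitates the "crosswalk [...]" labels shown in the Examples.

```r
# Toy crosswalk; column names follow the bundled BirdLife-BirdTree example.
toy <- data.frame(
  Species1   = c("Acanthidops bairdi", "Acanthis flammea", "Acanthis flammea"),
  Species3   = c("Acanthidops bairdii", "Carduelis flammea", "Carduelis hornemanni"),
  Match.type = c("1BL to 1BT", "1BL to many BT", "1BL to many BT"),
  stringsAsFactors = FALSE
)

# Assumed reading of "1... to 1...": the label starts with "1", and the
# target side (after " to ") also starts with "1".
is_one_to_one <- grepl("^1[^ ]* to 1", toy$Match.type)
kept <- toy[is_one_to_one, ]

# Hypothetical reconstruction of the returned shape (name_x, name_y, user_note).
sketch <- data.frame(
  name_x    = kept$Species1,
  name_y    = kept$Species3,
  user_note = paste0("crosswalk [", kept$Match.type, "]"),
  stringsAsFactors = FALSE
)
sketch
```

Only the Acanthidops row survives the filter here; the two Acanthis flammea rows describe a split and would be dropped.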

Value

A data frame with columns name_x, name_y, and user_note, ready to be passed as the overrides argument.

Details

Using a crosswalk is preferable to automated synonym resolution when an authoritative mapping exists — it is reproducible, does not depend on taxadb being available, and you can point to the published source in the methods section of your paper.

Examples

data(crosswalk_birdlife_birdtree)
overrides <- reconcile_crosswalk(
  crosswalk_birdlife_birdtree,
  from_col = "Species1",
  to_col = "Species3",
  match_type_col = "Match.type"
)
#>  1933 many-to-one entries (lumps) included
#>  225 one-to-many entries (splits) included
#>  Crosswalk: 3039 overrides (8079 identical pairs skipped)
head(overrides)
#>                    name_x               name_y                  user_note
#> 1      Acanthidops bairdi  Acanthidops bairdii     crosswalk [1BL to 1BT]
#> 2        Acanthis flammea    Carduelis flammea crosswalk [1BL to many BT]
#> 3        Acanthis flammea Carduelis hornemanni crosswalk [1BL to many BT]
#> 4       Acanthiza cinerea     Gerygone cinerea     crosswalk [1BL to 1BT]
#> 5 Acanthoptila nipalensis Turdoides nipalensis     crosswalk [1BL to 1BT]
#> 6       Accipiter bicolor  Accipiter chilensis crosswalk [1BL to many BT]
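As noted for the crosswalk argument, a file whose delimiter is not inferred from its extension has to be read manually first. The following is a minimal sketch using a made-up semicolon-delimited file with a hypothetical .dat extension; the two rows are taken from the example output above. The final call is guarded so the snippet also runs when the package providing reconcile_crosswalk() is not attached.

```r
# Write a small semicolon-delimited crosswalk to a temporary file (toy data).
path <- tempfile(fileext = ".dat")
writeLines(
  c(
    "Species1;Species3;Match.type",
    "Acanthidops bairdi;Acanthidops bairdii;1BL to 1BT",
    "Acanthiza cinerea;Gerygone cinerea;1BL to 1BT"
  ),
  path
)

# Read it yourself, because the extension is not .csv, .tsv or .txt.
cw <- read.table(
  path,
  header = TRUE,
  sep = ";",
  stringsAsFactors = FALSE
)

# Pass the resulting data frame instead of the file path.
if (exists("reconcile_crosswalk")) {
  overrides <- reconcile_crosswalk(
    cw,
    from_col = "Species1",
    to_col = "Species3",
    match_type_col = "Match.type"
  )
}
```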