
Taxonomic revisions often split a single species into several or lump several into one. When your data and your reference taxonomy disagree on such cases, the reconciliation mapping links one name in one source to several names in the other. reconcile_splits_lumps() scans a reconciliation for these cases and returns them as two tibbles, one for splits and one for lumps, so you can decide how to handle each before running your phylogenetic comparative method (PCM), e.g. by keeping only one of the split taxa, pooling traits across a lumped set, or excluding them entirely.

Usage

reconcile_splits_lumps(reconciliation, quiet = FALSE)

Arguments

reconciliation

A reconciliation object built with a non-NULL authority argument. The function inspects the name_resolved column, which is only populated when synonym resolution was performed.

quiet

Logical. Suppresses the console summary when TRUE. Default FALSE.

Value

Invisibly, a list with two tibbles:

splits

Cases where one name in source x corresponds to multiple accepted names in source y.

lumps

Cases where several names in source x share a single accepted name in source y.

Details

Detection relies on the name_resolved column, which is populated only by synonym resolution, so authority must have been non-NULL when the reconciliation was built. With authority = NULL, both tibbles come back empty.
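The grouping idea can be sketched in base R. This is an illustration only, not the internals of reconcile_splits_lumps(): it assumes a mapping with name_x, name_y and name_resolved columns, and it assumes the classification rule (one x-name with several y-names is a split, several x-names with one y-name is a lump) inferred from the Value section above.

```r
# Illustrative sketch only; NOT the internals of reconcile_splits_lumps().
# Hand-built mapping with the three columns the sketch needs.
mapping <- data.frame(
  name_x        = c("Acanthiza pusilla", "Acanthiza pusilla",
                    "Parus caeruleus",   "Cyanistes caeruleus"),
  name_y        = c("Acanthiza pusilla", "Acanthiza apicalis",
                    "Cyanistes caeruleus", "Cyanistes caeruleus"),
  name_resolved = c("Acanthiza pusilla", "Acanthiza pusilla",
                    "Cyanistes caeruleus", "Cyanistes caeruleus"),
  stringsAsFactors = FALSE
)

# Count distinct x-names and y-names within each resolved name.
n_distinct <- function(v) length(unique(v))
n_x <- tapply(mapping$name_x, mapping$name_resolved, n_distinct)
n_y <- tapply(mapping$name_y, mapping$name_resolved, n_distinct)

groups <- data.frame(
  name_resolved = names(n_x),
  n_x = as.integer(n_x),
  n_y = as.integer(n_y[names(n_x)]),
  stringsAsFactors = FALSE
)

# Assumed rule: 1 x-name -> many y-names is a split; many -> 1 is a lump.
groups$type <- NA_character_
groups$type[groups$n_x == 1 & groups$n_y > 1] <- "split"
groups$type[groups$n_x > 1 & groups$n_y == 1] <- "lump"

splits <- groups[which(groups$type == "split"), ]
lumps  <- groups[which(groups$type == "lump"), ]
```

Groups in which both counts are 1, or both exceed 1, are left unclassified in this sketch; how the package treats them is not stated here.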

Examples

# `reconcile_splits_lumps()` only surfaces rows that synonym lookup
# resolved (`match_type == "synonym"`), which requires `authority`
# to be non-NULL when building the reconciliation. The bundled-data
# call below uses `authority = NULL` for speed, so the output is
# empty:
data(avonet_subset)
data(tree_jetz)
rec <- reconcile_tree(avonet_subset, tree_jetz,
                      x_species = "Species1", authority = NULL,
                      quiet = TRUE)
sl <- reconcile_splits_lumps(rec, quiet = TRUE)
nrow(sl$splits); nrow(sl$lumps)   # 0 and 0
#> [1] 0
#> [1] 0

# To show what the output looks like when splits and lumps DO turn
# up, we hand-build a tiny reconciliation. In practice you would
# obtain this by calling reconcile_tree(..., authority = "col").
#
#   * Acanthiza pusilla (data) was split in CoL into A. pusilla and
#     A. apicalis  (1 x-name -> 2 y-names  ==>  split).
#   * Parus caeruleus and Cyanistes caeruleus (data: old + new names)
#     both map to Cyanistes caeruleus in CoL
#     (2 x-names -> 1 y-name  ==>  lump).
demo_mapping <- tibble::tibble(
  name_x        = c("Acanthiza pusilla", "Acanthiza pusilla",
                    "Parus caeruleus",   "Cyanistes caeruleus"),
  name_y        = c("Acanthiza pusilla", "Acanthiza apicalis",
                    "Cyanistes caeruleus", "Cyanistes caeruleus"),
  name_resolved = c("Acanthiza pusilla", "Acanthiza pusilla",
                    "Cyanistes caeruleus", "Cyanistes caeruleus"),
  match_type    = "synonym",
  match_score   = 1,
  match_source  = "col",
  in_x          = TRUE,
  in_y          = TRUE,
  notes         = NA_character_
)
rec_demo <- structure(
  list(mapping   = demo_mapping,
       meta      = list(type = "data_tree", authority = "col"),
       counts    = list(),
       overrides = tibble::tibble()),
  class = "reconciliation"
)
sl <- reconcile_splits_lumps(rec_demo, quiet = TRUE)
sl$splits     # 1 row: Acanthiza pusilla split into 2 taxa
#> # A tibble: 1 × 6
#>   name_resolved     names_x   names_y     n_x   n_y type 
#>   <chr>             <list>    <list>    <int> <int> <chr>
#> 1 Acanthiza pusilla <chr [1]> <chr [2]>     1     2 split
sl$lumps      # 1 row: Parus + Cyanistes lumped into 1 taxon
#> # A tibble: 1 × 6
#>   name_resolved       names_x   names_y     n_x   n_y type 
#>   <chr>               <list>    <list>    <int> <int> <chr>
#> 1 Cyanistes caeruleus <chr [2]> <chr [1]>     2     1 lump
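One way to act on a lump before a comparative analysis is to pool trait values across the lumped names. The base-R sketch below assumes a lumps table shaped like the output above (a name_resolved column and a names_x list column); the traits data frame, its species and mass_g columns, and the trait values are all hypothetical and exist only for illustration.

```r
# Sketch: pool a trait across a lumped set. All trait values are made up.
# `lumps_demo` mimics the relevant columns of sl$lumps shown above.
lumps_demo <- data.frame(name_resolved = "Cyanistes caeruleus",
                         stringsAsFactors = FALSE)
lumps_demo$names_x <- list(c("Parus caeruleus", "Cyanistes caeruleus"))

# Hypothetical trait table keyed by the names used in the data (x-names).
traits <- data.frame(
  species = c("Parus caeruleus", "Cyanistes caeruleus", "Acanthiza pusilla"),
  mass_g  = c(11.0, 12.0, 7.0),
  stringsAsFactors = FALSE
)

# Flatten the list column into an x-name -> accepted-name lookup.
lookup <- data.frame(
  species  = unlist(lumps_demo$names_x),
  accepted = rep(lumps_demo$name_resolved, lengths(lumps_demo$names_x)),
  stringsAsFactors = FALSE
)

# Names outside any lump keep their original name.
idx <- match(traits$species, lookup$species)
traits$accepted <- ifelse(is.na(idx), traits$species, lookup$accepted[idx])

# Pool by taking the mean per accepted name.
pooled <- aggregate(mass_g ~ accepted, data = traits, FUN = mean)
```

The mean is only one possible pooling rule; a median, a sample-size-weighted mean, or dropping the lumped set entirely may suit a given analysis better.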