Compare the tip labels of two phylogenetic trees and report which species are shared, which differ only in formatting or synonymy, and which appear in only one tree. Use this to assess the impact of switching phylogenies (e.g., Jetz et al. 2012 vs Clements 2025) before deciding which tree to use in a downstream phylogenetic comparative method (PCM).

Usage

reconcile_trees(
  tree1,
  tree2,
  authority = "col",
  rank = c("species", "subspecies"),
  overrides = NULL,
  db_version = NULL,
  fuzzy = FALSE,
  fuzzy_threshold = 0.9,
  resolve = c("flag", "first"),
  quiet = FALSE
)

Arguments

tree1

An ape::phylo object, or a character(1) path to a Newick/Nexus tree file.

tree2

An ape::phylo object, or a character(1) path to a Newick/Nexus tree file.

authority

A length-1 character vector, or NULL. Taxonomic authority used for synonym resolution (stage 3 of the cascade). One of:

"col" (default)

Catalogue of Life — broad, curated, frequently updated. A sensible default for most taxa.

"itis"

Integrated Taxonomic Information System — strong for North American vertebrates and plants.

"gbif"

Global Biodiversity Information Facility backbone. Wider coverage; includes more recent synonymy.

"ncbi"

NCBI Taxonomy — best when working with sequence data.

"ott"

Open Tree of Life synthetic taxonomy. Useful when your downstream phylogeny is from the Open Tree synthesis.

"itis_test"

A small bundled subset of ITIS, cached locally with taxadb for testing. Intended for examples and unit tests; not for analysis.

"gnverifier"

HTTP-backed verification against ~100 sources via the Global Names verifier; no local database download. See vignette("getting-started") for the trade-off (wider coverage, requires network and the httr2 package).

NULL

Skip the synonym stage entirely. Useful for quick checks or when taxadb is unavailable. Stages 1 and 2 still run, and stage 4 (fuzzy matching) runs if fuzzy = TRUE.

Five authority codes that earlier versions of the package advertised ("iucn", "tpl", "fb", "slb", "wd") are no longer accepted. Empirical testing against taxadb v22.12 showed that "iucn" fails with a schema mismatch and the other four are not taxadb providers at all. Passing any of these values now raises an error with migration guidance.

rank

A length-1 character vector. Controls how trinomials are handled during normalisation:

"species" (default)

Strip infraspecific epithets so that "Parus major major" becomes "Parus major" before matching.

"subspecies"

Keep trinomials intact. Use this when your analysis operates at subspecies level.
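
The effect of rank = "species" on a name can be approximated in base R. This is only a sketch: strip_infraspecific() is a hypothetical helper, not a function exported by the package, and it assumes names are whitespace-separated. The package's own normalisation may handle more cases (underscores, authorship strings, rank markers such as "subsp.").

```r
# Hypothetical helper, for illustration only: keep the first two
# whitespace-separated words (genus + species) and drop the rest.
strip_infraspecific <- function(x) {
  sub("^(\\S+\\s+\\S+)\\s+.*$", "\\1", trimws(x), perl = TRUE)
}

# Trinomials are reduced to binomials; shorter names are left unchanged.
strip_infraspecific(c("Parus major major", "Parus major", "Corvus"))
#> [1] "Parus major" "Parus major" "Corvus"
```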

overrides

Optional pre-built corrections. Either a data frame with at least columns name_x and name_y (plus an optional user_note column), or a file path to a CSV with the same columns. Any name listed here bypasses the cascade and is recorded as match_type = "manual". Useful for applying published crosswalks (see reconcile_crosswalk()) or for locking down decisions made in a previous run.
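
A minimal sketch of both accepted forms, assuming the column layout described above. The two name pairs are illustrative genus reassignments and may not occur in your trees; the call to reconcile_trees() is guarded so that the snippet also runs when the package is not attached.

```r
# Illustrative overrides table: name_x is the label in tree1,
# name_y the label it should be matched to in tree2.
overrides <- data.frame(
  name_x    = c("Parus caeruleus", "Carduelis chloris"),
  name_y    = c("Cyanistes caeruleus", "Chloris chloris"),
  user_note = c("genus reassignment", "genus reassignment"),
  stringsAsFactors = FALSE
)

# Save the table so the same decisions can be reused in a later run.
overrides_path <- file.path(tempdir(), "overrides.csv")
write.csv(overrides, overrides_path, row.names = FALSE)

# Pass either the data frame or the CSV path (only if the package is attached).
if (exists("reconcile_trees")) {
  data(tree_jetz)
  data(tree_clements25)
  rec <- reconcile_trees(tree_jetz, tree_clements25,
                         authority = NULL, overrides = overrides_path)
}
```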

db_version

A length-1 character vector, or NULL. The taxadb database snapshot to use (e.g. "22.12"). NULL (default) uses the latest available snapshot.

fuzzy

Logical. Enables the fuzzy-matching stage (stage 4) when TRUE. Default FALSE. Turn this on to catch likely typos (e.g. Corvus brachyrhnchos -> Corvus brachyrhynchos). When FALSE, the earlier stages still run as configured.

fuzzy_threshold

Numeric in [0, 1]. Minimum genus-weighted similarity score for a fuzzy match to be accepted. Default 0.9 (roughly "no more than ~10% of characters differ"). Lower values (e.g. 0.7) are more permissive but produce more false positives; always review fuzzy matches with reconcile_suggest() or reconcile_review() before trusting them.
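
To build intuition for the threshold, the sketch below computes a plain normalised edit-distance similarity with utils::adist(). This is an assumption made for illustration and is not the package's genus-weighted score, so the values it produces will differ from those the package reports. name_similarity() is a hypothetical helper.

```r
# Hypothetical helper: 1 minus the Levenshtein distance divided by the
# length of the longer string. NOT the package's genus-weighted score.
name_similarity <- function(a, b) {
  d <- utils::adist(a, b)[1, 1]
  1 - d / max(nchar(a), nchar(b))
}

typo    <- "Corvus brachyrhnchos"
correct <- "Corvus brachyrhynchos"

# One missing letter in a 21-character name: 1 - 1/21.
score <- name_similarity(typo, correct)

# Compare against the default and a stricter threshold.
score >= 0.9
score >= 0.99
```

Under this simplified score, a single-letter typo in a long binomial clears the default threshold, whereas the same typo in a short name may not, because the score is relative to name length.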

resolve

A length-1 character vector. What to do with borderline matches:

"flag" (default)

Mark low-confidence fuzzy matches (score below flag_threshold) and names with indirect taxadb synonymy as match_type = "flagged" so you can audit them with reconcile_review() or reconcile_suggest().

"first"

Accept the highest-scoring candidate silently, without flagging. Faster but riskier; use only when you have already reviewed the ambiguities.

quiet

Logical. Suppresses progress messages when TRUE. Default FALSE.

Value

A reconciliation object with meta$type == "tree_tree".

Examples

data(tree_jetz)
data(tree_clements25)
rec <- reconcile_trees(tree_jetz, tree_clements25, authority = NULL)
#>  Reconciling 657 tips (tree1) vs 854 tips (tree2)
#>  Matching 657 x 854 names through 2 stages...
#>  Stage 1/2: Exact matching...
#>  Stage 2/2: Normalised matching (641 matched so far)...
#>  Matched 641/657 tips between trees
rec
#> 
#> ── Reconciliation: tree vs tree ────────────────────────────────────────────────
#>   Source x: tree1 (657 tips)
#>   Source y: tree2 (854 tips)
#>   Authority: none
#>   Timestamp: 2026-06-16 10:10:00
#>  Match coverage: [█████████████████████████████░] 98% (641/657)
#> 
#> ── Match summary ──
#> 
#>  Exact: 641 (97.6%)
#>  Normalized: 0 ( 0.0%)
#>  Synonym: 0 ( 0.0%)
#>  Fuzzy: 0 ( 0.0%)
#>  Manual: 0 ( 0.0%)
#> ! Unresolved (x only):16 ( 2.4%)
#> ! Unresolved (y only):213
#> ! Flagged for review: 0
#>  Use `reconcile_summary()` for details, `reconcile_mapping()` for the full table.
# How many tips are shared across both trees?
sum(reconcile_mapping(rec)$in_x & reconcile_mapping(rec)$in_y)
#> [1] 641