
When a reconciliation identifies species that are present in your data but missing from the tree, reconcile_augment() attaches each missing species next to its congeners (species of the same genus already present in the tree), either as sister to one of them or at their most recent common ancestor. The result is a tree that contains every species in your dataset for which a placement could be found, at the cost of a strong assumption about where the new tips sit.

Usage

reconcile_augment(
  reconciliation,
  tree,
  where = c("genus", "near"),
  branch_length = c("congener_median", "half_terminal", "zero"),
  seed = NULL,
  quiet = FALSE,
  source = c("internal", "rtrees", "vphylomaker", "uphylomaker"),
  taxon = NULL,
  check_ultrametric = TRUE,
  ...
)

Arguments

reconciliation

A reconciliation object, typically from reconcile_tree().

tree

An ape::phylo object. Must be the same tree used to build reconciliation (or a tree with the same tip set). For source = "rtrees", this is passed to rtrees as the user-supplied backbone (tree_by_user = TRUE).

where

A length-1 character vector. Where to attach each new tip (only used when source = "internal"; ignored otherwise):

"genus" (default)

Attach as sister to a single congener chosen at random from the genus. Recommended when the genus has only one or two representatives in the tree, or when you want variation across runs for sensitivity analyses.

"near"

Attach at the most recent common ancestor (MRCA) of all congeners in the tree. Better when the genus is well-represented, because the new tip is not arbitrarily tied to one sister taxon.
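The two attachment targets can be illustrated with ape alone. This is a minimal sketch, not the package's internal code: it assumes tip labels of the form Genus_species, and the toy tree, the hypothetical missing species "A_missing", and the variable names are illustrative only.

```r
library(ape)

# Toy tree: genus A has two representatives, genus B has one.
tree <- read.tree(text = "((A_a:1,A_b:1):1,B_c:2);")

# Congeners of the hypothetical missing species "A_missing":
# every tip whose label starts with the same genus prefix.
congeners <- grep("^A_", tree$tip.label, value = TRUE)

# where = "genus": one congener, picked at random, becomes the sister tip.
set.seed(42)
sister <- sample(congeners, 1)

# where = "near": the attachment point is the MRCA node of all congeners.
mrca_node <- getMRCA(tree, congeners)
```

Note that ape::getMRCA() returns NULL when given fewer than two tips, so a genus with a single representative has no congeneric MRCA to attach to and would need separate handling in a sketch like this one.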

branch_length

A length-1 character vector. How to set the terminal branch length of each newly added tip (only used when source = "internal"; ignored otherwise — rtrees sets its own branch lengths):

"congener_median" (default)

Median terminal branch length of the species' congeners. This gives a typical "how long since a species in this genus diverged" value that is robust to a single unusually long or short branch. Recommended for time-calibrated trees because it preserves approximate branch-length scale.

"half_terminal"

Half the sister tip's terminal branch. A conservative alternative that places the new tip as a recent split from its sister. Useful when the genus is sparsely sampled and the median is unreliable.

"zero"

Zero-length branch, producing a polytomy with the sister taxon (or MRCA). Use for exploratory sensitivity checks where you want to see the effect of adding species without assuming any divergence time.

When the input tree is ultrametric, each grafted tip's terminal edge is adjusted after placement so the augmented tree stays ultrametric, which most phylogenetic comparative methods assume. In that case branch_length governs the initial graft only. The exception is "zero", which is not adjusted because it asks for a polytomy by construction.
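The three rules can be worked through by hand on a small tree. The following is a rough sketch using only ape, under the assumption that "terminal branch" means the edge leading directly to a tip; the helper terminal_length() is hypothetical and not part of the package.

```r
library(ape)

# Ultrametric toy tree with three representatives of genus A.
tree <- read.tree(text = "(((A_a:1,A_b:1):2,A_c:3):1,B_d:4);")

# Hypothetical helper: length of the terminal edge leading to each named tip.
terminal_length <- function(tree, tips) {
  tip_ids   <- match(tips, tree$tip.label)
  edge_rows <- match(tip_ids, tree$edge[, 2])  # each tip is the child of one edge
  tree$edge.length[edge_rows]
}

congeners <- c("A_a", "A_b", "A_c")
sister    <- "A_c"

# "congener_median": median of the congeners' terminal edges (1, 1, 3).
bl_congener_median <- median(terminal_length(tree, congeners))

# "half_terminal": half of the chosen sister's terminal edge (3 / 2).
bl_half_terminal <- terminal_length(tree, sister) / 2

# "zero": no divergence assumed.
bl_zero <- 0
```

Here the median rule yields 1 and the half-terminal rule yields 1.5, which shows how the two can disagree when one congener sits on a much longer branch than the others.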

seed

A length-1 integer or NULL (default). Only used when source = "internal" and where = "genus", where it fixes the random choice of congener. With a non-NULL seed the call is reproducible; set a fixed integer in real analyses. With NULL, the session's current RNG state is used, so results vary across runs, which is useful for sensitivity analyses that explore the variation introduced by the random choice. The seed is scoped to this call: the session RNG state is saved before and restored after, so subsequent random draws in your script are unaffected. For source = "rtrees", set the seed in your script before calling reconcile_augment(); rtrees does not accept a seed argument.
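One way a call-scoped seed can be implemented in base R is sketched below. This is an assumption about the general technique (saving and restoring .Random.seed), not a copy of what reconcile_augment() does internally; with_scoped_seed() is a hypothetical helper.

```r
# Hypothetical helper: evaluate `expr` under a fixed seed, then put the
# session RNG back exactly where it was.
with_scoped_seed <- function(seed, expr) {
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  old_seed <- if (had_seed) get(".Random.seed", envir = globalenv()) else NULL
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      rm(".Random.seed", envir = globalenv())
    }
  }, add = TRUE)
  set.seed(seed)
  force(expr)  # the lazy argument is evaluated here, after set.seed()
}

# The session stream is undisturbed by the seeded draw in between.
set.seed(1)
expected <- runif(1)

set.seed(1)
picked <- with_scoped_seed(42, sample(c("A_a", "A_b"), 1))
after  <- runif(1)

identical(expected, after)
```

The final comparison checks that the draw made after the scoped call equals the draw the session would have produced without it.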

quiet

Logical. Suppress progress messages? Default FALSE.

source

A length-1 character vector. Which grafting backend to use. One of "internal" (default), "rtrees", "vphylomaker", or "uphylomaker". See “Choosing a source”.

taxon

A length-1 character vector. Required when source = "rtrees". One of "bird", "mammal", "fish", "amphibian", "reptile", "plant", "shark_ray", "bee", "butterfly". Ignored for "internal", "vphylomaker", and "uphylomaker".

check_ultrametric

Logical. After grafting, check that the result is ultrametric and warn if not. Default TRUE. The "rtrees", "vphylomaker", and "uphylomaker" backends produce ultrametric trees by design; the "internal" backend does too when the input tree was ultrametric and branch_length is "congener_median" or "half_terminal", but not when branch_length = "zero" (which produces zero-length tip edges that break ultrametricity by construction).
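The effect of a zero-length tip edge on ultrametricity can be seen with ape::is.ultrametric(). In this small sketch, a hand-written Newick string stands in for a zero-length graft; it is not output from reconcile_augment(), and the tip "A_new" is hypothetical.

```r
library(ape)

# Every tip is 2 units from the root, so the tree is ultrametric.
ultra <- read.tree(text = "((A_a:1,A_b:1):1,B_c:2);")
ok_before <- is.ultrametric(ultra)

# Hypothetical tip attached at the MRCA of genus A with a zero-length edge:
# it sits only 1 unit from the root while the other tips sit at 2.
zero_graft <- read.tree(text = "((A_a:1,A_b:1,A_new:0):1,B_c:2);")
ok_after <- is.ultrametric(zero_graft)

c(before = ok_before, after = ok_after)
```

The first check passes and the second fails, which is the situation the check_ultrametric warning is meant to flag.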

...

Additional arguments forwarded to the chosen backend: rtrees::get_tree() for source = "rtrees" (e.g. scenario, n_tree); V.PhyloMaker2::phylo.maker() for source = "vphylomaker" (e.g. scenarios = "S3", nodes.type); U.PhyloMaker::phylo.maker() for source = "uphylomaker" (e.g. gen.list, scenario). Ignored when source = "internal".

Value

A list with:

tree

The augmented phylo object (or multiPhylo when source = "rtrees" returns a posterior sample).

original

The original (unmodified) phylo object, for easy comparison.

augmented

A tibble documenting each added species: species, genus, placed_near (sister tip / MRCA node / rtrees placement note), branch_length, method, n_congeners. For source = "rtrees", branch_length and n_congeners are NA because the backend chooses them.

skipped

A tibble of species that could not be placed, with the reason (e.g. "No congener in tree", "rtrees did not place this species").

meta

Provenance metadata: source, placement strategy, branch length rule, counts; for source = "rtrees" includes a backend_meta sub-list with the taxon and the number of grafted tips.

When to use this

Tip-grafting is an exploratory convenience, not a substitute for a properly inferred phylogeny. All source modes (see below) make strong placement assumptions that are often wrong in detail. Use it to keep exploratory phylogenetic comparative methods (PCMs) running while you decide how to handle orphan species, and always:

  1. Report exactly which species were augmented (see $augmented in the return value).

  2. Run sensitivity analyses with and without the augmented tips.

  3. Prefer a published imputed phylogeny (e.g. the PhyloMaker or TACT approaches) when grafting many species.
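A with-and-without comparison can be set up by dropping the grafted tips again with ape::drop.tip(). The sketch below uses a hand-written tree in place of real output and assumes the grafted tips can be identified by their tip labels; how species names map to tip labels in your own tree is something to verify.

```r
library(ape)

# Hypothetical augmented tree and the tip that was grafted onto it.
aug_tree   <- read.tree(text = "(((A_a:0.5,A_missing:0.5):0.5,A_b:1):1,B_c:2);")
added_tips <- "A_missing"

# Tree restricted to the originally sampled species.
base_tree <- drop.tip(aug_tree, added_tips)

# Any tree-based quantity can now be compared across the two versions.
# Here: the patristic distance between two original tips.
d_aug  <- cophenetic(aug_tree)["A_a", "B_c"]
d_base <- cophenetic(base_tree)["A_a", "B_c"]
```

In a real analysis the same model would be fitted once per tree (with the data subset to the matching tips) and the estimates compared, rather than a single distance.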

Choosing a source

"internal" (default)

Genus-level placement using only your tree (no external dependencies). Each missing species is attached as sister to a congener (or at the congeneric MRCA). Fast and reproducible, but only works when the genus is already represented in the tree, and assumes the new tip diverged in roughly the same way as its congeners.
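Whether a missing species can be placed by genus comes down to whether its genus occurs among the tip labels. A minimal sketch of that check follows, assuming binomials separated by a space in the data and by an underscore in the tree; it illustrates the idea and is not the package's matching logic.

```r
library(ape)

tree <- read.tree(text = "((A_a:1,A_b:1):1,B_c:2);")
missing_species <- c("A missing", "C absent")

# Genus = first word of the binomial (data) or first label component (tree).
missing_genus <- sub(" .*$", "", missing_species)
tree_genus    <- unique(sub("_.*$", "", tree$tip.label))

# A species is placeable only if its genus already has a tip in the tree.
placeable <- missing_genus %in% tree_genus

data.frame(species = missing_species, genus = missing_genus,
           placeable = placeable)
```

With these inputs the species in genus A is placeable and the species in genus C is not, matching the "No congener in tree" reason reported in the skipped table.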

"rtrees"

Delegates the grafting to the rtrees mega-tree machinery via rtrees::get_tree(tree_by_user = TRUE). Uses your tree as the backbone and lets rtrees place each missing species using genus / family information from a taxon-specific reference tree. Requires taxon and the GitHub-only rtrees package (https://daijiang.github.io/rtrees/). Helpful when the genus is absent from your tree but present in rtrees' reference — which the internal mode would skip.

"vphylomaker"

Plant-only alternative to "rtrees" via either of the GitHub packages V.PhyloMaker2 (https://github.com/jinyizju/V.PhyloMaker2, preferred when installed; updated and enlarged version) or V.PhyloMaker (https://github.com/jinyizju/V.PhyloMaker, used as a fallback; original 2019 version). Calls phylo.maker(sp.list, tree, scenarios = ...) with your tree as the backbone. Use this when you want explicit control over the V.PhyloMaker placement scenario ("S1", "S2", or "S3" — see Jin & Qian 2019/2022); otherwise "rtrees" with taxon = "plant" is simpler.

"uphylomaker"

Universal (plants + animals) variant of V.PhyloMaker, via the GitHub package U.PhyloMaker (https://github.com/jinyizju/U.PhyloMaker). Same phylo.maker convention but takes a gen.list (a genus-family lookup) so it can graft non-plant taxa as well as plants. Use this when your tree spans multiple kingdoms and you want the V.PhyloMaker placement strategy.

Use pr_get_tree() when you have only a species list and need a candidate tree from scratch (rotl, clootl, or rtrees). Use reconcile_augment() when you already have a tree and want to fill the gaps.

References

Paradis, E. & Schliep, K. (2019). ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35: 526–528. doi:10.1093/bioinformatics/bty633

Augmentation backends:

Jin, Y. & Qian, H. (2019). V.PhyloMaker: an R package that can generate very large phylogenies for vascular plants. Ecography 42(8): 1353–1359. doi:10.1111/ecog.04434 (source = "vphylomaker", fallback path.)

Jin, Y. & Qian, H. (2022). V.PhyloMaker2: an updated and enlarged R package that can generate very large phylogenies for vascular plants. Plant Diversity 44(4): 335–339. doi:10.1016/j.pld.2022.05.005 (source = "vphylomaker", preferred path.)

Jin, Y. & Qian, H. (2023). U.PhyloMaker: an R package that can generate large phylogenetic trees for plants and animals. Plant Diversity 45(3): 347–352. doi:10.1016/j.pld.2022.12.007 (source = "uphylomaker".)

See also

reconcile_tree() for the reconciliation step; reconcile_apply() for the non-augmenting alternative (prune data and tree to the intersection); pr_get_tree() for retrieving a candidate tree from external resources when you don't have a tree yet; pr_date_tree() for time-calibrating an existing topology; pr_cite_tree() for formatting tree provenance citations. The companion package pigauto consumes the resulting tree (or multiPhylo) directly via multi_impute_trees() for posterior-tree PCMs.

Other reconciliation functions: reconcile_apply(), reconcile_crosswalk(), reconcile_data(), reconcile_diff(), reconcile_export(), reconcile_mapping(), reconcile_merge(), reconcile_multi(), reconcile_override(), reconcile_override_batch(), reconcile_plot(), reconcile_report(), reconcile_review(), reconcile_splits_lumps(), reconcile_suggest(), reconcile_summary(), reconcile_to_trees(), reconcile_tree(), reconcile_trees()

Examples

# --- Example 1: genus-level placement with congener_median branch lengths ---
x <- data.frame(species = c("A a", "A missing", "B c", "C absent"))
tree <- ape::read.tree(text = "((A_a:1,A_b:1):1,B_c:2);")
result <- reconcile_tree(x, tree, x_species = "species",
                         authority = NULL, quiet = TRUE)

aug <- reconcile_augment(result, tree, seed = 42, quiet = TRUE)
#> Warning: Tree returned by "internal" is not strictly ultrametric.
#>  Most PCM methods (PGLS, BM, OU, etc.) assume ultrametric trees.
#> → To force: `phytools::force.ultrametric(result$tree)` or
#>   `ape::chronos(result$tree)`.
#>  To suppress this check: pass `check_ultrametric = FALSE`.

# Compare original vs augmented tree
cat("Original tips:", ape::Ntip(tree), "\n")
#> Original tips: 3 
cat("Augmented tips:", ape::Ntip(aug$tree), "\n")
#> Augmented tips: 4 
cat("Added:", nrow(aug$augmented), "| Skipped:", nrow(aug$skipped), "\n")
#> Added: 1 | Skipped: 1 

# Inspect which species were added and where they were placed
head(aug$augmented[, c("species", "genus", "placed_near",
                       "branch_length", "n_congeners")])
#> # A tibble: 1 × 5
#>   species   genus placed_near branch_length n_congeners
#>   <chr>     <chr> <chr>               <dbl>       <int>
#> 1 A missing A     A a                     1           2

# Species skipped (no congener in tree)
head(aug$skipped)
#> # A tibble: 1 × 3
#>   species  genus reason             
#>   <chr>    <chr> <chr>              
#> 1 C absent C     No congener in tree

# --- Example 2: MRCA placement with zero-length branches ---
aug_near <- reconcile_augment(result, tree,
                              where = "near",
                              branch_length = "zero",
                              seed = 42, quiet = TRUE)

cat("\nMRCA placement (zero branches):\n")
#> 
#> MRCA placement (zero branches):
cat("  Added:", nrow(aug_near$augmented), "\n")
#>   Added: 1 
# Compare: MRCA placement shows genus-level context
head(aug_near$augmented[, c("species", "placed_near", "method")])
#> # A tibble: 1 × 3
#>   species   placed_near      method
#>   <chr>     <chr>            <chr> 
#> 1 A missing MRCA of A a, A b near/0

if (FALSE) { # \dontrun{
  # --- Example 3: delegate grafting to rtrees ---
  # Useful when the genus is missing from your tree but present in
  # the rtrees taxon-specific reference tree.
  aug_rt <- reconcile_augment(result, tree,
                               source = "rtrees",
                               taxon  = "bird",
                               quiet  = TRUE)
  nrow(aug_rt$augmented)              # how many were placed
  aug_rt$meta$backend_meta$n_grafted  # how many at higher rank
} # }