Computes a small set of standard metrics for comparing trees that come from different backends (or different runs of the same backend). Designed for the common case of "I retrieved a tree from rotl and another from fishtree — do they agree?"

Usage

pr_tree_compare(..., prune_to_common = TRUE)

Arguments

...

Two or more phylo objects, or two or more pr_tree_result objects (the tree slot is extracted), or multiPhylo objects (the first tree is used). Trees can be passed as positional arguments or as a named list.

prune_to_common

Logical. Restrict each tree to the shared tip set before computing topology metrics? Default TRUE — without this, RF distance is undefined when tip sets differ.
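
To see why pruning matters, the sketch below restricts two trees to their shared tips before calling ape::dist.topo(). It is an illustration of the idea only, not the internals of pr_tree_compare(): the variable names are made up, and unrooting the pruned trees first is an assumption made here so that both trees are compared as unrooted topologies.

```r
library(ape)

set.seed(42)
t_a <- rtree(10)
t_b <- rtree(8, tip.label = t_a$tip.label[1:8])

# Tips present in both trees
shared <- intersect(t_a$tip.label, t_b$tip.label)

# Prune each tree to the shared tips; unrooting is an assumption of this sketch
a <- unroot(keep.tip(t_a, shared))
b <- unroot(keep.tip(t_b, shared))

# RF distance on the common tip set; NA if there are too few tips to compare
rf <- if (length(shared) >= 4) as.numeric(dist.topo(a, b)) else NA_real_
rf
```

For two binary unrooted trees on 8 shared tips, the result is an even number between 0 and 10.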

Value

A list with class pr_tree_compare and components:

n_trees

Number of input trees.

tip_sets

Named list of character vectors, one per tree.

shared_tips

Tips present in every input tree.

unique_to

Named list, one per tree, of tips present in that tree but missing from at least one other tree (that is, tips not in shared_tips).

n_shared

Length-1 integer: the number of tips in shared_tips.

pairwise_jaccard

Square matrix; (i, j) is the Jaccard index of tip_sets[[i]] vs tip_sets[[j]].

pairwise_rf

Square matrix of Robinson-Foulds distances between pairs of trees pruned to shared_tips. NA when a pair has fewer than 4 shared tips.

pairwise_branch_cor

Square matrix of Pearson correlations between matching edge lengths in each pair, or NA when one or both trees have no branch lengths.
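
The tip-set components above can be reproduced with a few lines of base R. This is a minimal sketch, not the package's implementation: the tip labels are hypothetical, and unique_to is computed under the reading that it holds each tree's tips that are not shared by all trees.

```r
# Hypothetical tip sets for three trees (labels are made up for illustration)
tip_sets <- list(
  tree1 = c("a", "b", "c", "d", "e"),
  tree2 = c("a", "b", "c", "d", "f"),
  tree3 = c("a", "b", "c", "g")
)

# Tips present in every tree
shared_tips <- Reduce(intersect, tip_sets)
n_shared <- length(shared_tips)

# Tips in each tree that are not shared by all trees (assumed reading)
unique_to <- lapply(tip_sets, setdiff, y = shared_tips)

# Jaccard index: size of the intersection divided by size of the union
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

idx <- seq_along(tip_sets)
pairwise_jaccard <- outer(
  idx, idx,
  Vectorize(function(i, j) jaccard(tip_sets[[i]], tip_sets[[j]]))
)
dimnames(pairwise_jaccard) <- list(names(tip_sets), names(tip_sets))
pairwise_jaccard
```

Here tree1 and tree2 share 4 of 6 distinct tips, so their Jaccard index is 4/6; each of them shares 3 of 6 distinct tips with tree3, giving 0.5.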

Details

RF distance is computed via ape::dist.topo() with its default method ("PH85"). Branch-length correlation matches edges by their tip-set bipartition: for each edge in tree A, the corresponding edge in tree B (if any) is the one that splits the tips into the same two sets. The Pearson correlation is taken over the matched edge-length pairs; edges whose bipartition is absent from the other tree are dropped. The matching follows the bipartition-based comparison that Kuhner & Felsenstein (1994) used for their branch score distance; it differs in that unmatched edges are dropped rather than counted as zero length.
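
The bipartition matching described above can be sketched as follows. This is a hedged illustration rather than the package's code: edge_splits() and branch_cor() are hypothetical helper names, both trees are assumed to already have the same tip set, terminal edges are included in the match, and trees are unrooted first so that the two root edges of a rooted tree do not map to the same bipartition.

```r
library(ape)

# Hypothetical helper: one row per edge, keyed by the bipartition it induces
edge_splits <- function(tree) {
  tree <- reorder(unroot(tree), "postorder")
  n <- length(tree$tip.label)
  all_tips <- sort(tree$tip.label)
  desc <- vector("list", n + tree$Nnode)   # descendant tips of each node
  for (i in seq_len(n)) desc[[i]] <- tree$tip.label[i]
  keys <- character(nrow(tree$edge))
  for (e in seq_len(nrow(tree$edge))) {
    parent <- tree$edge[e, 1]
    child <- tree$edge[e, 2]
    # Postorder guarantees desc[[child]] is complete when its edge is visited
    desc[[parent]] <- c(desc[[parent]], desc[[child]])
    side <- sort(desc[[child]])
    # Canonical side: the one that does not contain the first tip
    if (all_tips[1] %in% side) side <- setdiff(all_tips, side)
    keys[e] <- paste(side, collapse = "|")  # assumes labels contain no "|"
  }
  data.frame(key = keys, len = tree$edge.length)
}

# Hypothetical helper: Pearson correlation over matched edges only
branch_cor <- function(a, b) {
  if (is.null(a$edge.length) || is.null(b$edge.length)) return(NA_real_)
  m <- merge(edge_splits(a), edge_splits(b), by = "key",
             suffixes = c("_a", "_b"))      # inner join drops unmatched edges
  stats::cor(m$len_a, m$len_b)
}

set.seed(1)
tr <- rtree(8)
scaled <- tr
scaled$edge.length <- 2 * tr$edge.length
branch_cor(tr, scaled)
```

Because scaled has the same topology as tr with every branch doubled, all edges match and the correlation is 1. A tree without edge lengths yields NA, in line with the description of pairwise_branch_cor.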

References

Kuhner, M. K., & Felsenstein, J. (1994). A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Molecular Biology and Evolution 11(3): 459–468. doi:10.1093/oxfordjournals.molbev.a040126

Robinson, D. F., & Foulds, L. R. (1981). Comparison of phylogenetic trees. Mathematical Biosciences 53(1–2): 131–147. doi:10.1016/0025-5564(81)90043-2

See also

pr_get_tree() for retrieval; reconcile_apply() for combining a chosen tree with a dataset.

Examples

# Two trees with identical tip sets
set.seed(1)
t1 <- ape::rtree(10)
t2 <- ape::rtree(10, tip.label = t1$tip.label)
cmp <- pr_tree_compare(t1, t2)
cmp$n_shared
#> [1] 10
cmp$pairwise_rf
#>       tree1 tree2
#> tree1     0    16
#> tree2    16     0

# Two trees with overlapping but not identical tips
t3 <- ape::rtree(8, tip.label = t1$tip.label[1:8])
cmp <- pr_tree_compare(t1, t3)
cmp$pairwise_jaccard
#>       tree1 tree2
#> tree1   1.0   0.8
#> tree2   0.8   1.0