Agreement forest

Short description: Mathematical concept

In the mathematical field of graph theory, an agreement forest for two given (leaf-labeled, irreductible) trees is any (leaf-labeled, irreductible) forest which can, informally speaking, be obtained from both trees by removing a common number of edges.

Agreement forests first arose when studying combinatorial problems related to computational phylogenetics, in particular tree rearrangements.^[1]

Preliminaries

Recall that a tree (or a forest) is irreductible when it lacks any internal node of degree 2. In the case of a rooted tree (or a rooted forest), the root(s) are of course allowed to have degree 2, since they are not internal nodes. Any tree (or forest) can be made irreductible by applying a sequence of edge contractions.

An irreductible (rooted or unrooted) tree $T$ whose leaves are bijectively labeled by elements of a set $X$ is called a (rooted or unrooted) $X$ -tree. Such a $X$ -tree usually model a phylogenetic tree, where the elements of $X$ (the taxon set) could represent species, individual organisms, DNA sequences, or other biological objects.

Two $X$ -trees $T 1$ and $T 2$ are said to be isomorphic when there exists a graph isomorphism between them which preserves the leaf labels. In the case of rooted $X$ -trees, the isomorphism must also preserves the root.

Given a $X$ -tree $T$ and a taxon subset $Y \subseteq X$ , the minimal subtree of $T$ that connects all leaves in $Y$ is denoted by $T(Y)$ . When $T$ is rooted, then $T(Y)$ is also rooted, with its root being the node closest to the original root of $T$ . This $T(Y)$ subtree needs not be a $Y$ -tree, because it might not be irreductible. We therefore further define the restricted subtree $T|Y$ , which is obtained from $T(Y)$ by suppressing all internal nodes of degree 2, yielding a proper $Y$ -tree.

Agreement forests

An agreement forest for two unrooted $X$ -trees $T 1$ and $T 2$ is a partition ${X 1, X 2, ..., X k$ } of the taxon set $X$ satisfying the following conditions:

$T 1 |X i$ and $T 2 |X i$ are isomorphic for every $i \in {1,2,...,k}$ and
the subtrees in ${ T 1 (X i) : i \in {1,2,...,k} }$ and ${ T 2 (X i) : i \in {1,2,...,k} }$ are vertex-disjoint subtrees of $T 1$ and $T 2$ , respectively.

The set partition ${X 1, X 2, ..., X k$ } is identified with the forest of restricted subtrees $F = {T|X 1, T|X 2, ..., T|X k$ }, with either $T=T 1$ or $T=T 2$ (the choice of it begin irrelevant because of condition 1). Therefore, an agreement forest can either be seen as a partition of the taxon set $X$ or as a forest (in the classical graph-theoretic sense) of restricted subtrees.

The size of an agreement forest is simply its number of components. Intuitively, an agreement forest of size $k$ for two phylogenetic trees is a forest which can be obtained from both trees by removing $(k-1)$ edges in each tree and subsequently suppressing internal nodes of degree $2$ .

Rooted case

Acyclic agreement forests

A raffinement on the above definition can be made, resulting in the concept of acyclic agreement forest. An agreement forest $F$ for two $X$ -trees $T 1$ and $T 2$ is said to be acyclic if each of its tree components can be numbered in such a way that if the root of one component $X i \in F$ is an ancestor of the root of another component $X j \in F$ in either $T 1$ or $T 2$ , then the number assigned to $X i$ is lower than the number assigned to $X j$ .

Another characterization of acyclicity in agreement forest is to consider the directed graph $G F$ that has vertex set $F$ and a directed edge $(X i, X j)$ if and only if $i \neq j$ and at least one of the two following conditions hold:

the root of $T 1 (X i)$ is an ancestor of the root of $T 1 (X j)$ in $T 1$
the root of $T 2 (X i)$ is an ancestor of the root of $T 2 (X j)$ in $T 2$

The directed graph $G F$ is called the inheritance graph associated with the agreement forest $F$ , and we call $F$ acyclic if $G F$ has no directed cycle.

Optimization problems

A (rooted, unrooted, acyclic) agreement forest $F$ for $T 1$ and $T 2$ is said to be maximum if it contains the smallest possible number of elements (i.e. it has the smallest size). In this context, it is the agreement between the two trees which is maximized: it explains why computing a maximum agreement forest actually means minimizing its number of components. This leads to two different (but related) optimization problems. In both cases, we choose to minimize $| F | - 1$ rather than $| F |$ , because the former corresponds to the number of cuts to be done in each tree in order to obtain $F$ .

maximal ≠ maximum
unrooted MAF corresponds to TBR
rooted MAF corresponds to rSPR
acyclic MAF corresponds to HYB
AFs can be defined on non-binary trees
AFs can be defined on more than two trees
acyclic agreement forests have a role to play in the computation of HYB on 3 or more trees, but the relationship is much weaker than in the case of 2 trees
Complexity
FPT algorithms
Approximation algorithms
Exponential time algorithms

Notes

↑ Jotun Hein; Tao Jiang; Lusheng Wang; Kaizhong Zhang (1996). "On the complexity of comparing evolutionary trees". Discrete Applied Mathematics 71 (1–3): 153–169. doi:10.1016/S0166-218X(96)00062-5.

0.00

(0 votes)

Original source: https://en.wikipedia.org/wiki/Agreement forest. Read more

[complexity1996-1] Jotun Hein; Tao Jiang; Lusheng Wang; Kaizhong Zhang (1996). "On the complexity of comparing evolutionary trees". Discrete Applied Mathematics 71 (1–3): 153–169. doi:10.1016/S0166-218X(96)00062-5.

[1]

Anonymous

Search

Agreement forest

Namespaces

More

Page actions

Contents

Preliminaries

Agreement forests

Rooted case

Acyclic agreement forests

Optimization problems

Notes

Navigation

Navigation

Help

Translate

Wiki tools

Wiki tools

Anonymous

Search

Agreement forest

Preliminaries

Agreement forests

Rooted case

Acyclic agreement forests

Optimization problems

Notes

Navigation

Wiki tools

Page tools

Other projects

Categories