next up previous
Next: Maximum Likelihood Up: Introduction Previous: Parsimony

Distance Matrix Methods

Distance matrix methods fit a tree to a matrix of pairwise distances between the sequences. Most distance methods use some form of weighted or unweighted least squares measure. Given are the distances dij. The problem is to find distances $\delta(i,j)$ such that the score of the tree is minimized. The score of a tree is often defined as:

\begin{displaymath}F(T) = \sum_{i < j} \frac{(d_{ij} - \delta(i,j))^2}{d^p_{ij}}
\end{displaymath} (1)

where where p can be set to 0, 1 or 2. The methods assume that the variance of the measurement error is proportional to the p-th power of the expectation. The only rigorous way to get an optimal solution is by trying out all tree topologies. For a given tree topology, this least squares problem leads to a system of linear equations which can be solved using standard methods of numerical linear algebra. However, finding the optimal topology is generally an intractable problem, since the number of tree topologies grows exponentially with the number of nodes.

Chantal Korostensky
1999-07-14