Next: Example
Up: Evaluation Measures of Multiple
Previous: Probability of an evolutionary
To illustrate how the scoring function can be used, several tools
for generating MSAs were run on a set of protein families simulated
under a Markovian model of evolution, and the output of each tool
was evaluated using the CS measure. This provides, of course, only
an approximate assessment of the MSA tools themselves; a better
assessment requires actual experimental sequence data.
Random trees with a given structure and given edge lengths were
generated, each with a random sequence at the root. Starting from
the root sequence, mutations, insertions and deletions of different
sizes were introduced according to the lengths of the edges of the
tree, so that a new sequence was generated at each internal node.
At the end of the simulation, only the sequences at the leaves were
retained. Since the positions of the insertions and deletions and
the "real" tree are both known, the correct MSA is known as well.
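The simulation described above can be sketched as follows. This is only a minimal illustration, not the procedure actually used: it assumes a uniform substitution model instead of the Markovian model of the text, and the names `evolve`, `simulate`, `sub_rate`, `indel_rate` and `max_indel` are hypothetical. Every residue carries a column identifier, so the correct MSA can be read off once the leaf sequences are known.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 amino acids

def evolve(seq, length, columns, rng, sub_rate=1.0, indel_rate=0.1, max_indel=3):
    """Return a mutated copy of seq, a list of (column_id, residue) pairs.

    `columns` is the global left-to-right order of all column ids created
    so far; insertions register their new columns in it (assumed model).
    """
    # Substitutions: each site changes with a probability that grows with edge length.
    p_sub = 1.0 - math.exp(-sub_rate * length)
    seq = [(c, rng.choice(ALPHABET)) if rng.random() < p_sub else (c, r)
           for c, r in seq]
    # Indels: the expected number of events also grows with edge length.
    p_indel = 1.0 - math.exp(-indel_rate * length)
    n_events = sum(rng.random() < p_indel for _ in range(len(seq)))
    for _ in range(n_events):
        size = rng.randint(1, max_indel)
        if rng.random() < 0.5:
            # Deletion: drop `size` residues, but never empty the sequence.
            if len(seq) > size:
                start = rng.randrange(len(seq) - size + 1)
                del seq[start:start + size]
        else:
            # Insertion: new columns go right after the column of the left neighbour.
            pos = rng.randrange(len(seq) + 1)
            anchor = columns.index(seq[pos - 1][0]) + 1 if pos > 0 else 0
            new_ids = list(range(len(columns), len(columns) + size))
            columns[anchor:anchor] = new_ids
            seq[pos:pos] = [(i, rng.choice(ALPHABET)) for i in new_ids]
    return seq

def simulate(tree, root_len=30, seed=0):
    """Evolve a random root sequence down `tree`; return (leaf sequences, true MSA).

    A leaf is a string name; an internal node is a list of (child, edge_length) pairs.
    """
    rng = random.Random(seed)
    columns = list(range(root_len))
    root = [(c, rng.choice(ALPHABET)) for c in columns]
    leaves = {}

    def descend(node, seq):
        if isinstance(node, str):
            leaves[node] = seq
            return
        for child, length in node:
            descend(child, evolve(seq, length, columns, rng))

    descend(tree, root)
    rows = {name: dict(seq) for name, seq in leaves.items()}
    # Keep only columns that survive in at least one leaf; absent residues become gaps.
    used = [c for c in columns if any(c in row for row in rows.values())]
    msa = {name: "".join(row.get(c, "-") for c in used) for name, row in rows.items()}
    sequences = {name: "".join(r for _, r in seq) for name, seq in leaves.items()}
    return sequences, msa

tree = [([("A", 0.2), ("B", 0.3)], 0.1), ([("C", 0.25), ("D", 0.15)], 0.4)]
sequences, msa = simulate(tree)
```

Only `sequences` would be handed to an alignment tool; `msa` plays the role of the "real" alignment against which the tool's output is scored.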
The retained leaf sequences can be given to different algorithms
(MSA [Gupta et al., 1996, Lipman et al., 1989], MAP [Huang, 1994],
ClustalW [Higgins and Sharp, 1989, Thompson et al., 1994] and the
probabilistic model (PAS) [Gonnet and Benner, 1996, Gonnet, 1994a]),
and the score of each calculated MSA can be compared to the score of
the "real" (generated) MSA using the CS measure. The results for
three combinations of trees and sizes are shown in Figures 12-14;
they are representative of all other results. The circular order was
always derived using a TSP algorithm, not from the generated tree.
Since the correct tree is known, however, it was easy to verify that
the TSP order is in fact a circular order, which was always the case.
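The verification mentioned above can be sketched as follows, assuming the usual definition of a circular order: a tour through all leaves that traverses every edge of the tree exactly twice. The function names and the edge-list representation of the tree are hypothetical choices made for this illustration.

```python
from collections import Counter

def tree_path(adj, start, goal):
    """Return the unique path of nodes from start to goal in a tree."""
    stack = [(start, [start])]
    seen = {start}
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [nxt]))
    raise ValueError("nodes are not connected")

def is_circular_order(edges, order):
    """Check that the cyclic leaf order crosses every tree edge exactly twice."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    counts = Counter()
    # Walk from each leaf to its successor, closing the cycle at the end.
    for a, b in zip(order, order[1:] + order[:1]):
        path = tree_path(adj, a, b)
        for u, v in zip(path, path[1:]):
            counts[frozenset((u, v))] += 1
    return all(counts[frozenset(e)] == 2 for e in edges)

# Four leaves A, B, C, D joined through internal nodes x and y.
edges = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")]
print(is_circular_order(edges, ["A", "B", "C", "D"]))
print(is_circular_order(edges, ["A", "C", "B", "D"]))
```

In the second order, the internal edge between x and y is crossed four times, so that order is not circular for this tree.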
Chantal Korostensky
1999-07-14