Gaps and Trees

Assume we are given an optimal MSA $\mbox{\rm A}$ and assume further that we know the tree $T(\mbox{\rm A})$. Each indel event can be mapped to exactly one edge in the tree (see the figure below). A deletion event produces gaps in all sequences $a_i \in \mbox{\rm A}$ of the subtree below the edge where the event took place. In contrast, an insertion event produces gaps in all sequences that are not in the subtree below the event.
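The mapping from a gap pattern to candidate tree edges can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple tree encoding, the function names, and the topology ((A,B),(C,(D,E))) are hypothetical choices made here, not taken from the source. Each non-root node stands for the edge leading down to it.

```python
# Hypothetical rooted tree as nested tuples; leaves are sequence names.
# The topology ((A,B),(C,(D,E))) is an assumption made for this sketch.
TREE = (("A", "B"), ("C", ("D", "E")))


def leaves(node):
    """Return the set of leaf names at or below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def subtrees(node, is_root=True):
    """Yield every non-root node; each one represents the edge above it."""
    if not is_root:
        yield node
    if not isinstance(node, str):
        for child in node:
            yield from subtrees(child, is_root=False)


def explain_gap_pattern(tree, gapped):
    """List (event_type, leaves_below_edge) pairs that explain a gap pattern.

    A deletion on an edge gaps exactly the leaves below that edge.
    An insertion on an edge gaps exactly the leaves NOT below that edge.
    """
    gapped = frozenset(gapped)
    all_leaves = leaves(tree)
    explanations = []
    for node in subtrees(tree):
        below = leaves(node)
        if below == gapped:
            explanations.append(("deletion", below))
        if all_leaves - below == gapped:
            explanations.append(("insertion", below))
    return explanations


if __name__ == "__main__":
    for pattern in [{"A", "B"}, {"A", "B", "C", "E"}, {"D", "E"}]:
        for event, below in explain_gap_pattern(TREE, pattern):
            print(sorted(pattern), "->", event, "on edge above", sorted(below))
```

Under the assumed topology, a pattern that gaps exactly the leaves on one side of the root has two explanations (a deletion on one root edge or an insertion on the other), while patterns deeper in the tree have a single explanation.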

Example 1.1: In the alignment of the figure below, there are 4 indel events, which we call gap blocks (see the definition of a gap block). The first gap block consists of two gaps, one in each of sequences A and B. If x is the ancestor of all the sequences, then the event must be either a deletion in the ancestor of sequences A and B or an insertion on the edge $x \rightarrow w$. Similarly, event 2 is an insertion in sequence D, event 3 represents a deletion in the ancestor of D and E, and finally event 4 is an insertion in sequence C.

Figure: Indel events are shown in the evolutionary tree (right) of the MSA (left).


Event     1         2        3         4
A: RPCVCP___VLRQAAQ__QVLQRQIIQGPQQLRRLF_AA
B: RPCACP___VLRQVVQ__QALQRQIIQGPQQLRRLF_AA
C: KPCLCPKQAAVKQAAH__QQLYQGQLQGPKQVRRAFRLL
D: KPCVCPRQLVLRQAAHLAQQLYQGQ____RQVRRAF_VA
E: KPCVCPRQLVLRQAAH__QQLYQGQ____RQVRRLF_AA

[Tree image not reproduced: gaps/treegaps.eps]
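The four events in the alignment above can be recovered mechanically. The sketch below uses a simplified working notion of a gap block, assumed here for illustration and not necessarily identical to the formal definition: a maximal run of adjacent columns that share the same non-empty set of gapped sequences. The function name `gap_blocks` and the dictionary layout are hypothetical.

```python
from itertools import groupby

# The five aligned sequences from the figure; "_" marks a gap character.
ALIGNMENT = {
    "A": "RPCVCP___VLRQAAQ__QVLQRQIIQGPQQLRRLF_AA",
    "B": "RPCACP___VLRQVVQ__QALQRQIIQGPQQLRRLF_AA",
    "C": "KPCLCPKQAAVKQAAH__QQLYQGQLQGPKQVRRAFRLL",
    "D": "KPCVCPRQLVLRQAAHLAQQLYQGQ____RQVRRAF_VA",
    "E": "KPCVCPRQLVLRQAAH__QQLYQGQ____RQVRRLF_AA",
}


def gap_blocks(alignment, gap_char="_"):
    """Return (start, end, gapped_names) for each run of columns that
    share the same non-empty set of gapped sequences (end is exclusive)."""
    names = sorted(alignment)
    if len({len(alignment[name]) for name in names}) != 1:
        raise ValueError("all aligned sequences must have the same length")

    # For every column, record which sequences carry a gap character.
    columns = zip(*(alignment[name] for name in names))
    patterns = [
        frozenset(name for name, ch in zip(names, column) if ch == gap_char)
        for column in columns
    ]

    # Merge adjacent columns with identical patterns; skip gap-free runs.
    blocks, start = [], 0
    for pattern, run in groupby(patterns):
        width = len(list(run))
        if pattern:
            blocks.append((start, start + width, pattern))
        start += width
    return blocks


if __name__ == "__main__":
    for number, (start, end, gapped) in enumerate(gap_blocks(ALIGNMENT), 1):
        print(f"event {number}: columns {start}..{end - 1}, gaps in {sorted(gapped)}")
```

Column indices are 0-based positions within the sequences, without the row labels. The sets of gapped sequences found this way are the patterns that Example 1.1 interprets as insertions or deletions on tree edges.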

Chantal Korostensky
1999-07-14