#### Shifted gaps

MSA algorithms very often produce alignments in which gaps of the same size appear slightly shifted horizontally relative to one another. From a biological point of view there are two possibilities:

a) The gaps are in a loop. Since gaps appear frequently in loops, it is possible that two events produced indels of the same size at the same place. For structural reasons it therefore makes sense to group gaps that lie within a short distance of each other.

b) The gaps are not in a loop. Since gaps are less frequent outside of loops, it is unlikely that two different events produced gaps of the same size at the same place. It is more likely that mutations at the ends of the gaps forced the algorithm to place a gap incorrectly. In this case, too, it makes sense to group the gaps together.

When gaps are grouped together, the number of evolutionary events is reduced (e.g. gaps 1 and 1' in Example 2.1). If the corresponding tree is known, this can be viewed as "pushing" the events up in the tree and fusing them (see the figure below). Since gaps are expensive (they carry a large negative score), reducing the number of indel events reduces the cost, so in most cases the score of the MSA increases.
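The reduction in the number of events can be made concrete with a small Python sketch. It assumes, purely for illustration, that gaps are written as `_` and that two gaps belong to the same event exactly when they share the same start column and length. `find_gaps` and `count_events` are hypothetical helper names, and the three rows are taken from Example 2.1.

```python
import re

GAP = "_"  # assumed gap symbol


def find_gaps(row):
    """Return (start, length) for every maximal run of gap symbols in a row."""
    pattern = re.escape(GAP) + "+"
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, row)]


def count_events(msa):
    """Count distinct (start, length) gap blocks. Identical blocks in
    different rows are treated as one shared indel event (an assumption)."""
    return len({gap for row in msa for gap in find_gaps(row)})


shifted = ["RP___CVCPVLRQ", "RPCACP___VLRQ", "KPCVCPRQLVLRQ"]
grouped = ["RPCVCP___VLRQ", "RPCACP___VLRQ", "KPCVCPRQLVLRQ"]

print(count_events(shifted))  # 2
print(count_events(grouped))  # 1
```

Counting distinct gap blocks is only a rough proxy for the number of indel events; with a known tree, the events would be counted on the tree edges instead.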

Example 2.1: Gaps 1 and 1' have the same size (length 3) and appear at almost the same place. When they are grouped together, the score increases from 135.466 to 240.139, the value listed as the maximum, and the alignment makes more biological sense (if we ask an expert [23]).

Figure: When gaps of the same size but from different sequences are grouped together, this corresponds to "pushing" the event up in the evolutionary tree.
```
Score:   135.466              Score:   240.139
Maximum: 240.139              Maximum: 240.139

Event 1   1'                  Event     1
A: RP___CVCPVLRQ              A: RPCVCP___VLRQ
B: RPCACP___VLRQ              B: RPCACP___VLRQ
C: KPCVCPRQLVLRQ      =>      C: KPCVCPRQLVLRQ
D: KPCVCPRQLVLRQ              D: KPCVCPRQLVLRQ
E: KPCLCPKQAAVKQ              E: KPCLCPKQAAVKQ
```
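The grouping step in Example 2.1 could be sketched in Python as follows. This is a minimal greedy illustration under stated assumptions, not the method used to produce the scores above: `toy_score` is a hypothetical stand-in that simply counts identical residue pairs per column, `max_shift` is an assumed distance threshold, and all function names are invented for this sketch. A gap is moved onto a nearby gap of the same length in another row only if the stand-in score improves.

```python
import re
from itertools import combinations, permutations

GAP = "_"  # gap symbol used in Example 2.1


def find_gaps(row):
    """Return (start, length) for every maximal run of gap symbols in a row."""
    pattern = re.escape(GAP) + "+"
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, row)]


def move_gap(row, start, length, new_start):
    """Cut the gap at `start` out of `row` and reinsert it at column `new_start`."""
    rest = row[:start] + row[start + length:]
    return rest[:new_start] + GAP * length + rest[new_start:]


def toy_score(msa):
    """Hypothetical stand-in score: identical residue pairs, summed over columns."""
    total = 0
    for column in zip(*msa):
        residues = [c for c in column if c != GAP]
        total += sum(a == b for a, b in combinations(residues, 2))
    return total


def best_single_move(msa, max_shift, score):
    """Best MSA reachable by moving one gap onto a same-length gap of another
    row at most `max_shift` columns away; None if no move improves the score."""
    best, best_score = None, score(msa)
    for i, j in permutations(range(len(msa)), 2):
        for start_i, len_i in find_gaps(msa[i]):
            for start_j, len_j in find_gaps(msa[j]):
                if len_i == len_j and 0 < abs(start_i - start_j) <= max_shift:
                    moved = move_gap(msa[i], start_i, len_i, start_j)
                    candidate = msa[:i] + [moved] + msa[i + 1:]
                    if score(candidate) > best_score:
                        best, best_score = candidate, score(candidate)
    return best


def group_shifted_gaps(msa, max_shift=4, score=toy_score):
    """Apply improving moves until none is left (the score rises strictly)."""
    msa = list(msa)
    while True:
        better = best_single_move(msa, max_shift, score)
        if better is None:
            return msa
        msa = better


example = [
    "RP___CVCPVLRQ",  # A, gap 1
    "RPCACP___VLRQ",  # B, gap 1'
    "KPCVCPRQLVLRQ",  # C
    "KPCVCPRQLVLRQ",  # D
    "KPCLCPKQAAVKQ",  # E
]
grouped = group_shifted_gaps(example)
print(grouped[0])  # RPCVCP___VLRQ
```

Both directions are tried (gap 1 onto gap 1' and the reverse), and the stand-in score favours moving the gap in A, which matches the right-hand alignment of the example. A real implementation would plug in the actual MSA scoring function in place of `toy_score`.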

Chantal Korostensky
1999-07-14