Definition 1.1
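Let $S = \{s_1, \ldots, s_k\}$ be a set of sequences with $s_i \in \Sigma^*$, where $\Sigma$ is a finite alphabet. A Multiple Sequence Alignment (MSA) consists of a set of sequences $A = \{a_1, \ldots, a_k\}$ with $a_i \in (\Sigma')^*$, where $\Sigma' = \Sigma \cup \{-\}$. All aligned sequences have the same length: $|a_i| = |a_j|$ for all $i, j$. The sequence obtained from $a_i$ by removing all gap characters is equal to $s_i$.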
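The requirements of this definition can be checked mechanically: every aligned row must have the same length, and stripping the gap characters from a row must give back the corresponding input sequence. The following is a minimal sketch, assuming that `-` is the gap character and that sequences are plain Python strings; the name `is_valid_msa` is hypothetical and used only for illustration.

```python
GAP = "-"  # assumed gap character, not part of the sequence alphabet


def is_valid_msa(sequences, aligned):
    """Return True if `aligned` is a valid MSA of `sequences`.

    Both arguments are lists of strings; aligned[i] is the aligned
    version of sequences[i].
    """
    # There must be exactly one aligned row per input sequence.
    if len(sequences) != len(aligned):
        return False
    # All aligned rows must have the same length.
    if len({len(row) for row in aligned}) > 1:
        return False
    # Removing the gap characters must recover each original sequence.
    return all(row.replace(GAP, "") == seq
               for seq, row in zip(sequences, aligned))
```

For example, `is_valid_msa(["ACGT", "AGT"], ["ACGT", "A-GT"])` holds, while the unpadded rows `["ACGT", "AGT"]` fail the equal-length check.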
Definition 1.2
The character $-$, or any contiguous sequence of $-$ characters within an aligned sequence, is called a gap. A gap corresponds to an insertion or deletion event (indel).
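To make the notion of a gap concrete, the sketch below locates gaps in one aligned sequence. It assumes `-` as the gap character and reports only maximal runs, although by the definition every contiguous sub-run is also a gap. The helper name `find_gaps` is hypothetical.

```python
import re

GAP = "-"  # assumed gap character


def find_gaps(aligned_row):
    """Return (start, length) for every maximal run of gap characters."""
    pattern = re.escape(GAP) + "+"
    return [(m.start(), m.end() - m.start())
            for m in re.finditer(pattern, aligned_row)]
```

Under these assumptions, `find_gaps("AC--G-T")` yields one gap of length 2 starting at index 2 and one gap of length 1 at index 5.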
Definition 1.6
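Let $\mathcal{M}$ be the set of all possible MSAs that can be generated for a given set of sequences $S$. The optimal MSA $A^* \in \mathcal{M}$ is an MSA such that, w.l.o.g., $\mathrm{score}(A^*) = \max_{A \in \mathcal{M}} \mathrm{score}(A)$. For some scoring functions the minimum, rather than the maximum, is optimal.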
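The selection of an optimal MSA can be illustrated on a small scale. Enumerating every possible MSA is impractical, so this sketch picks the best alignment from a finite candidate list supplied by the caller. The scoring function shown is a simple sum-of-pairs score with illustrative parameters; it is an assumption for the example, not the scoring function defined in this text. The names `sum_of_pairs` and `optimal_msa` are hypothetical.

```python
from itertools import combinations

GAP = "-"  # assumed gap character


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Illustrative sum-of-pairs score; the parameter values are assumptions."""
    total = 0
    for column in zip(*alignment):          # walk the alignment column by column
        for x, y in combinations(column, 2):
            if x == GAP and y == GAP:
                continue                    # gap-gap pairs score 0 in this sketch
            if x == GAP or y == GAP:
                total += gap
            elif x == y:
                total += match
            else:
                total += mismatch
    return total


def optimal_msa(candidates, score=sum_of_pairs, maximize=True):
    """Pick the candidate with the best score (max by default, min otherwise)."""
    pick = max if maximize else min
    return pick(candidates, key=score)
```

The `maximize` flag mirrors the remark that some scoring functions treat the minimum as optimal, for instance when the score is a distance or cost.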