For various purposes, including the estimation of PAM distances
between two aligned sequences, it is necessary to use an array of
Dayhoff matrices computed for a range of PAM distances.
Chapter - The Pairwise Comparison of Amino Acid
Sequences discusses this estimation in greater depth.
The Darwin function CreateDayMatrices computes such a range of ``enhanced''
Dayhoff matrices.
Returns: DayMatrix
Synopsis: This function performs the following three
calculations:
(1) It assigns a Dayhoff matrix computed at PAM distance 250 to
the system variable DM.
(2) It computes 1266 Dayhoff matrices for various PAM distances
between 0.049 and 1000 and assigns the list of such matrices
to the system variable DMS.
The PAM distances of these matrices are not restricted to integers, and CreateDayMatrices places a large number of the matrices at low PAM distances, as the first entries of DMS show.
> CreateDayMatrices();
> for i from 1 to length(DMS) do
>     printf(' %5.5f', DMS[i][PamNumber]);
> od;
 0.04945 0.05055 0.05167 0.05282 0.05399 0.05519 0.05642
 0.05767 0.05896 0.06027 0.06161 0.06297 0.06437 0.06580
 0.06727 0.06876 0.07029 0.07185 0.07345 0.07508 0.07675
 0.07845 0.08020 0.08198 0.08380 0.08566 0.08757 0.08951
 0.09150 0.09354 0.09561 0.09774 0.09991 0.10213 0.10440
 0.10672 0.10909 0.11152 0.11400 0.11653 0.11912 0.12176
 0.12447 0.12724 0.13006 0.13295 0.13591 0.13893 0.14202
 0.14517 0.14840 0.15170 0.15507 0.15851 0.16204 0.16564
 0.16932 0.17308 0.17693 0.18086 0.18488 0.18899 0.19319
 0.19748 0.20187 0.20635 0.21094 0.21563 0.22042 0.22532
 0.23032 0.23544 0.24067 0.24602 0.25149 0.25708 0.26279
 0.26863 0.27460 0.28070 0.28694 0.29332 0.29983 0.30650
 ...
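The values listed above grow by a nearly constant factor: 0.05055/0.04945 and 0.30650/0.29983 are both about 1.022, so in this low-PAM range each matrix lies roughly 2.2% beyond its predecessor. The following sketch prints the ratio of successive PAM distances for the first few entries. It uses only constructs already shown in this section and assumes CreateDayMatrices() has been called. The upper bound of 10 is an arbitrary choice, and nothing here shows that the same spacing holds across the whole of DMS.

```darwin
# Ratio of each PAM distance to its predecessor, for the first entries of DMS.
# Assumes CreateDayMatrices() has already been called; the bound 10 is arbitrary.
for i from 2 to 10 do
    printf(' %5.5f', DMS[i][PamNumber] / DMS[i-1][PamNumber]);
od;
```

Raising the upper bound toward length(DMS) would show whether, and where, the spacing changes at higher PAM distances.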
Comparing the 250-PAM original Dayhoff matrix with the 250-PAM ``enhanced'' Dayhoff matrix reveals that, although similar, the matrices have significant differences.
> OrigTot := [87, 41, 40, 47, 33, 38, 50, 89, 34, 37,
>             85, 81, 15, 40, 51, 70, 58, 10, 30, 65];
> OrigFreq := OrigTot/sum(OrigTot);
> OrigDM := CreateOrigDayMatrix(Mutations1978, OrigFreq, 250);
> print(OrigDM);
DayMatrix(Peptide, pam=250, Sim: max=17.302, min=-7.510, max offdiag=6.951, del=-19.814-1.396*(k-1))
C  12.0
S  -0.0   1.6
T  -2.2   1.3   2.6
P  -2.7   0.9   0.3   5.9
A  -2.0   1.1   1.2   1.1   1.8
G  -3.3   1.1  -0.0  -0.5   1.3   4.8
N  -3.6   0.7   0.4  -0.5   0.2   0.4   2.0
D  -5.1   0.3  -0.1  -1.0   0.3   0.6   2.1   3.9
E  -5.3  -0.0  -0.4  -0.6   0.3   0.2   1.4   3.4   3.9
Q  -5.3  -0.5  -0.8   0.2  -0.4  -1.2   0.8   1.6   2.5   4.1
H  -3.4  -0.8  -1.3  -0.3  -1.4  -2.1   1.6   0.7   0.6   2.9   6.6
R  -3.6  -0.3  -0.9  -0.2  -1.6  -2.6  -0.0  -1.3  -1.1   1.2   1.5   6.1
K  -5.4  -0.2  -0.0  -1.2  -1.2  -1.7   1.0   0.1  -0.1   0.7  -0.1   3.4   4.7
M  -5.2  -1.6  -0.6  -2.1  -1.2  -2.8  -1.8  -2.6  -2.2  -1.0  -2.2  -0.5   0.4   6.6
I  -2.3  -1.4   0.1  -2.0  -0.5  -2.6  -1.8  -2.4  -2.0  -2.0  -2.5  -2.0  -1.9   2.2   4.6
L  -6.0  -2.8  -1.7  -2.6  -1.9  -4.0  -2.9  -4.0  -3.3  -1.8  -2.1  -3.0  -2.9   3.7   2.4   6.0
V  -1.9  -1.0   0.3  -1.2   0.2  -1.4  -1.8  -2.2  -1.8  -1.9  -2.3  -2.5  -2.5   1.8   3.7   1.8   4.3
F  -4.3  -3.2  -3.1  -4.6  -3.5  -4.8  -3.5  -5.6  -5.4  -4.7  -1.8  -4.5  -5.3   0.2   1.0   1.8  -1.2   9.1
Y   0.4  -2.8  -2.8  -5.0  -3.5  -5.3  -2.1  -4.3  -4.3  -4.0  -0.1  -4.2  -4.5  -2.5  -1.0  -0.9  -2.5   7.0  10.2
W  -7.5  -2.3  -5.0  -5.5  -5.6  -6.8  -3.9  -6.6  -6.8  -4.6  -2.5   2.3  -3.3  -4.1  -5.0  -1.7  -6.1   0.5   0.0  17.3

> print(DM);
DayMatrix(Peptide, pam=250, Sim: max=14.152, min=-5.161, max offdiag=5.080, del=-19.814-1.396*(k-1))
C  11.5
S   0.1   2.2
T  -0.5   1.5   2.5
P  -3.1   0.4   0.1   7.6
A   0.5   1.1   0.6   0.3   2.4
G  -2.0   0.4  -1.1  -1.6   0.5   6.6
N  -1.8   0.9   0.5  -0.9  -0.3   0.4   3.8
D  -3.2   0.5  -0.0  -0.7  -0.3   0.1   2.2   4.7
E  -3.0   0.2  -0.1  -0.5  -0.0  -0.8   0.9   2.7   3.6
Q  -2.4   0.2   0.0  -0.2  -0.2  -1.0   0.7   0.9   1.7   2.7
H  -1.3  -0.2  -0.3  -1.1  -0.8  -1.4   1.2   0.4   0.4   1.2   6.0
R  -2.2  -0.2  -0.2  -0.9  -0.6  -1.0   0.3  -0.3   0.4   1.5   0.6   4.7
K  -2.8   0.1   0.1  -0.6  -0.4  -1.1   0.8   0.5   1.2   1.5   0.6   2.7   3.2
M  -0.9  -1.4  -0.6  -2.4  -0.7  -3.5  -2.2  -3.0  -2.0  -1.0  -1.3  -1.7  -1.4   4.3
I  -1.1  -1.8  -0.6  -2.6  -0.8  -4.5  -2.8  -3.8  -2.7  -1.9  -2.2  -2.4  -2.1   2.5   4.0
L  -1.5  -2.1  -1.3  -2.3  -1.2  -4.4  -3.0  -4.0  -2.8  -1.6  -1.9  -2.2  -2.1   2.8   2.8   4.0
V  -0.0  -1.0   0.0  -1.8   0.1  -3.3  -2.2  -2.9  -1.9  -1.5  -2.0  -2.0  -1.7   1.6   3.1   1.8   3.4
F  -0.8  -2.8  -2.2  -3.8  -2.3  -5.2  -3.1  -4.5  -3.9  -2.6  -0.1  -3.2  -3.3   1.6   1.0   2.0   0.1   7.0
Y  -0.5  -1.9  -1.9  -3.1  -2.2  -4.0  -1.4  -2.8  -2.7  -1.7   2.2  -1.8  -2.1  -0.2  -0.7  -0.0  -1.1   5.1   7.8
W  -1.0  -3.3  -3.5  -5.0  -3.6  -4.0  -3.6  -5.2  -4.3  -2.7  -0.8  -1.6  -3.5  -1.0  -1.8  -0.7  -2.6   3.6   4.1  14.2
To extract an entry of DMS with a particular PAM distance, we use
the function

SearchDayMatrix(p : real, D : array(DayMatrix))

where p is the target PAM distance and D is the array of Dayhoff
matrices.
> print(SearchDayMatrix(250, DMS));
DayMatrix(Peptide, pam=250, Sim: max=14.152, min=-5.161, max offdiag=5.080, del=-19.814-1.396*(k-1))
...

The result is equivalent to the matrix assigned to the variable DM.
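One way to picture such a lookup is a linear scan over DMS that keeps the entry whose PAM distance is closest to the target. The sketch below is an illustration only and is not the implementation of SearchDayMatrix. The names target and best are hypothetical, abs is assumed to be the usual absolute-value function, and CreateDayMatrices() is assumed to have been called.

```darwin
# Illustrative nearest-PAM lookup by linear scan (not how SearchDayMatrix works internally).
# target and best are hypothetical names introduced for this sketch.
target := 100;
best := 1;
for i from 2 to length(DMS) do
    # keep the index whose PAM distance lies closest to the target
    if abs(DMS[i][PamNumber] - target) < abs(DMS[best][PamNumber] - target) then
        best := i;
    fi;
od;
printf(' %5.5f\n', DMS[best][PamNumber]);
```

The PAM distances listed earlier appear in increasing order, so a binary search would also work if that ordering holds for all of DMS. With only 1266 entries, though, a plain scan is cheap.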
It is interesting to examine how these matrices change from low PAM distances, where the entries are positive on the diagonal and strongly negative off the diagonal, to high PAM distances, where the entries smooth out toward zero.
> print(DMS[1]);
DayMatrix(Peptide, pam=0.0494497, Sim: max=18.839, min=-50.808, max offdiag=-24.377, del=-47.348-1.396*(k-1))
C  17.2
S -31.5  12.2
T -34.6 -25.7  12.1
P -46.5 -31.6 -32.5  13.4
A -31.1 -27.3 -30.5 -31.8  11.1
G -38.3 -31.8 -38.4 -37.9 -31.3  11.3
N -37.2 -28.6 -30.6 -37.1 -35.4 -32.2  13.4
D -45.3 -31.8 -33.1 -35.8 -34.3 -33.5 -27.0  12.7
E -48.7 -32.4 -33.8 -34.7 -31.7 -36.8 -32.6 -25.8  12.3
Q -41.8 -31.4 -32.0 -32.7 -32.4 -35.9 -30.4 -31.8 -26.2  14.2
H -35.2 -33.3 -32.8 -35.8 -35.2 -37.1 -28.4 -32.5 -32.4 -27.3  16.3
R -37.3 -33.5 -33.7 -36.2 -34.8 -35.5 -33.4 -37.5 -34.1 -28.2 -31.3  12.7
K -46.6 -32.6 -31.4 -34.8 -33.9 -37.0 -29.7 -33.5 -29.0 -27.2 -31.6 -25.6  12.3
M -34.6 -34.7 -32.3 -44.2 -32.4 -40.7 -38.4 -45.9 -36.7 -30.8 -34.2 -37.3 -33.7  16.4
I -38.6 -39.3 -33.4 -40.6 -38.1 -47.8 -38.6 -46.9 -40.3 -38.9 -38.4 -40.0 -37.3 -26.4  12.4
L -37.6 -38.8 -37.0 -36.9 -35.4 -44.1 -40.8 -48.5 -40.5 -34.0 -37.5 -37.1 -37.4 -25.9 -27.3  10.3
V -32.4 -37.4 -30.1 -37.7 -29.7 -42.6 -40.3 -47.1 -35.7 -37.0 -39.6 -37.8 -37.2 -30.6 -24.4 -30.1  11.6
F -35.0 -41.1 -37.8 -42.6 -38.2 -45.3 -39.5 -46.2 -45.6 -39.4 -33.6 -44.2 -43.4 -29.5 -32.5 -29.3 -34.8  13.9
Y -34.1 -35.0 -37.8 -39.7 -38.8 -42.7 -34.6 -38.9 -40.2 -37.2 -27.2 -36.1 -37.7 -35.7 -36.8 -35.5 -36.0 -24.7  14.8
W -35.5 -38.4 -41.7 -45.2 -41.6 -39.6 -41.4 -50.8 -42.6 -37.1 -35.9 -34.5 -43.7 -35.9 -38.4 -35.9 -41.7 -28.9 -28.2  18.8
    C     S     T     P     A     G     N     D     E     Q     H     R     K     M     I     L     V     F     Y     W

> print(DMS[length(DMS)]);
DayMatrix(Peptide, pam=1000, Sim: max=3.184, min=-0.454, max offdiag=0.882, del=-15.338-1.396*(k-1))
C   0.9
S  -0.0   0.0
T  -0.0   0.0   0.0
P  -0.1   0.1   0.0   0.3
A   0.0   0.0   0.0   0.0   0.0
G  -0.1   0.1   0.0   0.1   0.1   0.5
N  -0.1   0.1   0.0   0.1   0.0   0.1   0.1
D  -0.1   0.1   0.0   0.1   0.0   0.2   0.1   0.2
E  -0.1   0.1   0.0   0.1   0.0   0.1   0.1   0.1   0.1
Q  -0.1   0.0   0.0   0.0   0.0   0.1   0.1   0.1   0.1   0.1
H  -0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
R  -0.1   0.0   0.0   0.0   0.0   0.1   0.1   0.1   0.1   0.1   0.0   0.1
K  -0.1   0.0   0.0   0.1   0.0   0.1   0.1   0.1   0.1   0.1   0.0   0.1   0.1
M   0.0  -0.1  -0.0  -0.1  -0.0  -0.2  -0.1  -0.2  -0.1  -0.1  -0.0  -0.1  -0.1   0.2
I   0.0  -0.1  -0.0  -0.1  -0.0  -0.3  -0.1  -0.2  -0.1  -0.1  -0.1  -0.1  -0.1   0.2   0.2
L   0.0  -0.1  -0.0  -0.1  -0.1  -0.3  -0.1  -0.2  -0.2  -0.1  -0.1  -0.1  -0.1   0.2   0.3   0.3
V   0.1  -0.1  -0.0  -0.1  -0.0  -0.2  -0.1  -0.1  -0.1  -0.1  -0.0  -0.1  -0.1   0.1   0.2   0.2   0.1
F   0.1  -0.2  -0.1  -0.2  -0.1  -0.4  -0.2  -0.3  -0.2  -0.1  -0.0  -0.2  -0.2   0.2   0.3   0.3   0.2   0.6
Y   0.1  -0.1  -0.1  -0.2  -0.1  -0.3  -0.1  -0.2  -0.2  -0.1   0.0  -0.1  -0.1   0.2   0.2   0.2   0.1   0.5   0.5
W   0.1  -0.2  -0.2  -0.4  -0.2  -0.5  -0.3  -0.4  -0.3  -0.2   0.0  -0.2  -0.3   0.2   0.2   0.2   0.1   0.9   0.9   3.2
    C     S     T     P     A     G     N     D     E     Q     H     R     K     M     I     L     V     F     Y     W
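To watch the transition between these two extremes, one can print a few matrices at intermediate distances. The sketch below uses only SearchDayMatrix, print, and the loop syntax shown earlier in this section. The list name pams and its values are arbitrary choices made for illustration, and CreateDayMatrices() is assumed to have been called.

```darwin
# Print the matrix found for several target PAM distances.
# pams is a hypothetical name; its values are arbitrary sample points.
pams := [1, 10, 50, 100, 250, 500];
for i from 1 to length(pams) do
    print(SearchDayMatrix(pams[i], DMS));
od;
```

The Sim: max and min values in each header line give a compact summary of how far the entries have flattened at that distance.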