Next: Point Accepted Mutations and Up: The DARWIN Manual © Previous: Calling External Functions

Darwin and Problems from Biochemistry

The second part of this book consists of a series of chapters, each of which is concerned with one particular problem from bioinformatics. The order of the chapters mirrors Darwin's tower of information. The tower is founded upon raw sequence data in the form of DNA or RNA. In terms of the biochemist's workbench, this is our raw material, and it is here that the bioinformatist's job begins. Each successive "step up" the tower represents a movement away from the raw, unprocessed information and closer to a complete understanding of how the sequences act as a functional unit in the living organism.

At the top of our tower lies the secondary structure prediction. This, in some sense, is the ultimate goal for our system and the end of the bioinformatist's job. Our secondary structure predictions are based on multiple sequence alignments. These alignments indicate conserved and non-conserved areas of a protein and highlight the different structural units such as alpha helices and beta sheets. Of course, this implies that the accuracy of our structure predictions depends on the accuracy of our multiple sequence alignments. In turn, the accuracy of our alignments depends on how accurately our phylogenetic trees represent the true ancestral relationships between the species from which the sequences are taken. Our phylogenetic trees are constructed from the distances and variances derived from the pairwise comparison of protein sequences. And, at the bottom of our tower, the protein sequences are extracted from the raw DNA or RNA supplied to us by the biochemist.

Of course, any mistake at any level of this tower percolates upwards. But, conversely, any improvement to an algorithm does too.
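
To make the percolation concrete, here is a minimal sketch in Python (not Darwin code) of the two lowest steps of the tower. It assumes the standard genetic code and a single forward reading frame, and it uses the plain fraction of differing positions as a stand-in distance. The names translate and p_distance are hypothetical helpers written for this illustration; they are not Darwin library functions, and the distance shown is not the estimate Darwin itself computes.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate DNA or RNA (reading frame 0) up to the first stop codon.

    Hypothetical helper for illustration; unknown codons become 'X'.
    """
    dna = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


def p_distance(a, b):
    """Fraction of positions at which two equal-length sequences differ.

    A deliberately simple stand-in for a real evolutionary distance.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


original = translate("ATGGCCAAATAA")  # "MAK"
mutated = translate("ATGGCCAGATAA")   # "MAR": one base differs in the third codon
distance = p_distance(original, mutated)
```

A single changed base in the raw DNA alters one residue of the protein, which in turn changes the pairwise distance (here one of three positions differs). Every level built on that distance (trees, alignments, structure predictions) would inherit the change.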

Figure: Darwin's Tower of Information

We do not claim that the solutions presented herein are the only or the best way to go about solving any particular bioinformatics problem. The algorithms we have chosen to include in the Darwin libraries have strong arguments, both mathematical and biological, suggesting they will perform well in practice. However, there are other methods (possibly with unrealistic resource demands) that may be more pertinent to your particular situation and data. The strength of Darwin lies in the fact that any method (assuming it is algorithmic) can be programmed in the language.

Each of the following chapters contains:

Beyond the understanding of the Darwin libraries, we hope such a presentation gives users

Gaston Gonnet