> Entry(4)['SEQ'];
MGAQQGKDRGAHSGGGGSGAPVSCIGLSSSPVASVSPHCISSSSGVSSAP ..(1520).. SLRQISNALNR

This statement returns a copy of what is found in Entry(4) of the Sample/SH2 database as a string.
However, it is sometimes more convenient to reference a sequence in
the database than to make a copy of it.
The Match structures, which we explore in Chapter
- The Pairwise Comparison of Sequences, require such references.
To reference a sequence in Darwin, we provide the offset of the
sequence from DB[string]. The Sequence structure
allows us to do this easily. Given an Entry structure, it
returns the offset to the end of the opening <SEQ> tag for that
entry in the form of an unevaluated Sequence function call.
> Sequence(Entry(1));
Sequence(367)
> Sequence(Entry(2));
Sequence(1338)
> Sequence(Entry(76, 77, 78));
Sequence(74267,75202,76197)
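The difference between copying a sequence and referencing it by offset can be illustrated outside Darwin with a small toy model. The Python sketch below is not Darwin's implementation. The miniature database string, its tag layout, and the helper names `entry_starts`, `sequence_offset` and `sequence_at` are all assumptions made for illustration. It only shows the idea of storing an integer offset that points just past the opening <SEQ> tag rather than storing the residues themselves.

```python
# Toy model only: NOT Darwin's implementation. The database text, the
# <E>/<SEQ> tag layout and every helper name here are assumptions.
DB_STRING = (
    "<E><ID>A</ID><SEQ>MKERV</SEQ></E>\n"
    "<E><ID>B</ID><SEQ>MGAQQGK</SEQ></E>\n"
)


def entry_starts(db: str) -> list[int]:
    """Return the offset at which each <E> entry begins."""
    starts = []
    pos = db.find("<E>")
    while pos != -1:
        starts.append(pos)
        pos = db.find("<E>", pos + 1)
    return starts


def sequence_offset(db: str, entry_number: int) -> int:
    """Offset just past the opening <SEQ> tag of a 1-based entry number."""
    start = entry_starts(db)[entry_number - 1]
    return db.index("<SEQ>", start) + len("<SEQ>")


def sequence_at(db: str, offset: int) -> str:
    """Read residues from the given offset up to the closing </SEQ> tag."""
    return db[offset:db.index("</SEQ>", offset)]


# A reference is just an integer; the residues are read only on demand.
ref = sequence_offset(DB_STRING, 2)
print(ref, sequence_at(DB_STRING, ref))
```

In this model the reference costs one integer per sequence regardless of sequence length, which is the practical motivation for referencing over copying.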
We can combine the Entry, Offset and Sequence structured types to return the offset of the sequence contained in an entry given only an offset from DB[string].
> offset_from_DF := 45000:
> entry_number := Entry(Offset(offset_from_DF));
entry_number := Entry(45)
> seq := Sequence(entry_number);
seq := Sequence(44545)
> seq;
Sequence(44545)
> print(seq);
MKERVKEMKVFGCRLNFWNHIGHEPDQFQNQRRQRRVLQPRIQRAAVSPNSSTTNSQ
FSLQHNSSGSLGGGVGGGLGGGGSLGLGGGGGGGGSCTPTSLQPQSSLTTFKQSPTL
LNGNGNLLDANMPGGIPTPGTPNSKAKDNSHFVKLVVALYLGKAIEGGDLSVGEKNA
EYEVIDDSQEHWWKVKDALGNVGYIPSNYVQAEALLGLERYEWYVGYMSRQRAESLL
KQGDKEGCFVVRKSSTKGLYTLSLHTKVPQSHVKHYHIKQNARCEYYLSEKHCCETI
PDLINYHRHNSGGLACRLKSSPCDRPVPPTAGLSHDKWEIHPIQLMLMEELGSGQFG
VVRRGKWRGSIDTAVKMMKEGTMSEDDFIEEAKVMTKLQHPNLVQLYGVCTKHRPIY
IVTEYMKHGSLLNYLRRHEKTLIGNMGLLLDMCIQVSKGMTYLERHNYIHRDLAARN
CLVGSENVVKVADFGLARYVLDDQYTSSGGTKFPIKWAPPEVLNYTRFSSKSDVWAY
GVLMWEIFTCGKMPYGRLKNTEVVERVQRGIILEKPKSCAKEIYDVMKLCWSHGPEE
RPAFRVLMDQLALVAQTLTD
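The chain from an arbitrary offset to its enclosing entry and then to that entry's sequence offset can also be sketched as a toy model. The Python below is an illustration under stated assumptions, not a description of how Darwin resolves offsets. The miniature database, its tags, and the helper names `entry_at_offset` and `sequence_offset_from` are hypothetical. The sketch uses a binary search over the entry start positions, which is one plausible way to perform such a lookup.

```python
from bisect import bisect_right

# Toy model only: NOT Darwin's implementation. The database text, the
# <E>/<SEQ> tag layout and every helper name here are assumptions.
DB_STRING = (
    "<E><ID>A</ID><SEQ>MKERV</SEQ></E>\n"
    "<E><ID>B</ID><SEQ>MGAQQGK</SEQ></E>\n"
)


def entry_starts(db: str) -> list[int]:
    """Return the offset at which each <E> entry begins."""
    starts = []
    pos = db.find("<E>")
    while pos != -1:
        starts.append(pos)
        pos = db.find("<E>", pos + 1)
    return starts


def entry_at_offset(db: str, offset: int) -> int:
    """1-based number of the entry whose text contains the given offset."""
    starts = entry_starts(db)
    if not 0 <= offset < len(db) or not starts or offset < starts[0]:
        raise ValueError("offset does not fall inside any entry")
    # bisect_right counts how many entries start at or before the offset.
    return bisect_right(starts, offset)


def sequence_offset_from(db: str, offset: int) -> int:
    """Offset of the sequence in the entry that contains the given offset."""
    number = entry_at_offset(db, offset)
    start = entry_starts(db)[number - 1]
    return db.index("<SEQ>", start) + len("<SEQ>")


# Any offset inside the second entry resolves to that entry's sequence.
print(entry_at_offset(DB_STRING, 40), sequence_offset_from(DB_STRING, 40))
```

As in the Darwin session above, the offset passed in does not need to point at the sequence itself. Any position inside the entry is enough to recover the entry number and, from it, the sequence offset.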
Darwin offers a simpler way to find the offset of a sequence for an entry. Selecting on an Entry structure with option 'SequenceOffset' or, simply, 'SO' returns a Sequence structure containing the offset.
> Entry(1)['SO'];
Sequence(367)