
Library Functions

Library functions should be named according to the following convention.


(1) The name should consist of at most five parts: a domain, an adverb, a verb, an adjective and a noun, in that order.


(2) The verb should reflect the action in a meaningful way, e.g. Draw, Load, Save, Print. For string searches, the verb should be Search; for aligning sequences, Align; for loading information from a file, Load; for creating a graphics file, Draw. If a generic verb is required (such as Create, Build, Do, Compute or Make), we recommend that Make be chosen. For example, ReadRawFile loads a "raw file" and should be changed to LoadRawFile. The function DrawHistogram creates a histogram and should be changed to MakeHistogram.
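The verb substitutions in rule 2 can be sketched as a small lookup. This is only an illustration in Python, not part of Darwin: the table PREFERRED_VERB and the function rename_verb are hypothetical names, and the Read to Load entry is inferred from the ReadRawFile example rather than stated as a general rule.

```python
# Hypothetical table: generic verbs collapse to Make (rule 2);
# Read -> Load is an assumption inferred from ReadRawFile -> LoadRawFile.
PREFERRED_VERB = {
    "Create": "Make",
    "Build": "Make",
    "Do": "Make",
    "Compute": "Make",
    "Read": "Load",
}


def rename_verb(name: str) -> str:
    """Replace a leading discouraged verb with the preferred one."""
    for old, new in PREFERRED_VERB.items():
        rest = name[len(old):]
        # Require a capital after the verb so that "Download" is not
        # mistaken for the verb "Do" followed by "wnload".
        if name.startswith(old) and rest[:1].isupper():
            return new + rest
    return name


print(rename_verb("ReadRawFile"))
```

Names whose verb is already acceptable, such as DrawTree, pass through unchanged. A case like DrawHistogram, where the choice of verb depends on what the routine actually produces, still needs human judgement.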

(3) The noun will typically be the object of the sentence you would use to describe the routine in full. It will typically be a type or structured type. If the routine works on a particular type, then this type should be placed in the name as the noun. For example, ReadDb should be changed to LoadDatabase.

(4) The noun chosen should represent the generic object and be mathematical in nature, e.g. you would choose DrawTree rather than DrawLarson.

(5) The adjective should only be included when, without it, two or more routines could not be told apart. There are currently two routines in Darwin to create phylogenetic trees: DrawUnrootedTree (unrooted) and DrawTree (rooted). We would choose the most generic mathematical name (Rule 4). This would lead to the choices DrawRootedTree and DrawUnrootedTree.

(6) The first and only the first letter of each word should be capitalised, unless the word is an abbreviation common in the biology/biochemistry literature, e.g. DNA, RNA, PAM and possibly AA. See list of abbreviations below.
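As a rough illustration of rule 6, the following Python sketch capitalises each word of a name while leaving recognised abbreviations in upper case. The helper names capitalise_part and join_parts are hypothetical, and the abbreviation set holds only the three abbreviations rule 6 names without reservation; the full list is maintained elsewhere.

```python
# Assumption: only the abbreviations named outright in rule 6 are listed.
ABBREVIATIONS = {"DNA", "RNA", "PAM"}


def capitalise_part(word: str) -> str:
    """Upper-case known abbreviations; otherwise capitalise the first letter only."""
    if word.upper() in ABBREVIATIONS:
        return word.upper()
    return word[:1].upper() + word[1:].lower()


def join_parts(words) -> str:
    """Concatenate the words of a name after applying the capitalisation rule."""
    return "".join(capitalise_part(w) for w in words)


print(join_parts(["load", "dna", "database"]))
```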

(7) The adverb indicates a qualified action. For example, currently the function ApprxTextSearch looks for an approximate match of a string against a body of text. This should be changed to ApproximateSearchString (Approximate is the adverb, Search is the verb, String is the object type.)

(8) The domain is a special identifier used to indicate that the routine that follows (i.e. the <adverb><verb><adjective><noun>) applies to a special type of object. There are only two special domains in Darwin: Inter Processor Communication, abbreviated to IPC, and Nucleotide/Peptide, abbreviated to NucPep.

The domain is essentially only a short form for the name. The information could be encoded as adjectives and adverbs in the name of the function, but this leads to very long names. As a concrete example, the routines GlobalAlign, LocalAlign, LocalAlignBestPam etc. are used to match amino acid sequences with amino acid sequences. Most of the routines have analogs in the nucleotide/peptide matching arena. We would rename LocalAlignBestPam to AlignBestPamMatch according to the new naming conventions, but when we enter the nucleotide/peptide setting this becomes AlignNucPepBestPamMatch, which is a little too long (well over twenty letters) and rather cumbersome. Instead, we suggest NucPep_AlignPamMatch. Now the same function name is retained except prefixed by the new domain. (It is still too long, but this requires that ShakeAlignBestPamMatch be abbreviated, possibly to ShakeAlignPamMatch.)

The same problems arise with the IPC protocols.
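To make the ordering of the parts concrete, here is a minimal Python sketch that assembles a name from its components. The function compose_name and its parameters are hypothetical. It assumes the order <adverb><verb><adjective><noun> described in rule 8, with the domain attached in front by an underscore.

```python
def compose_name(verb: str, noun: str, adverb: str = "",
                 adjective: str = "", domain: str = "") -> str:
    """Build a function name from already-capitalised parts.

    Optional parts default to the empty string and are simply omitted.
    """
    core = adverb + verb + adjective + noun
    # Only the domain is separated from the rest by an underscore.
    return f"{domain}_{core}" if domain else core


print(compose_name("Draw", "Tree", adjective="Unrooted"))
print(compose_name("Search", "String", adverb="Approximate"))
print(compose_name("Align", "Match", adjective="Pam", domain="NucPep"))
```

The three calls reproduce DrawUnrootedTree, ApproximateSearchString and NucPep_AlignPamMatch from the rules above.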

(9) Abbreviations should be avoided. When function names are too long, the adverb and adjective should be the first to be abbreviated. All abbreviations should follow the $\star$ rule above.

(10) Underscore characters should be avoided except to separate <domain> from the rest of the name. Of course, underscore characters are needed for polymorphism, but this poses no problem with our conventions.
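A check for rule 10 might look like the following Python sketch. The names KNOWN_DOMAINS, split_domain and follows_underscore_rule are hypothetical, and the check deliberately ignores the underscores required for polymorphism, so names of that kind would need to be exempted separately.

```python
# The two domains listed in rule 8.
KNOWN_DOMAINS = {"IPC", "NucPep"}


def split_domain(name: str):
    """Return (domain, rest); domain is None when no known domain prefix is present."""
    prefix, sep, rest = name.partition("_")
    if sep and prefix in KNOWN_DOMAINS:
        return prefix, rest
    return None, name


def follows_underscore_rule(name: str) -> bool:
    """True when the only underscore, if any, separates a known domain from the name."""
    _, rest = split_domain(name)
    return "_" not in rest


print(split_domain("NucPep_AlignPamMatch"))
print(follows_underscore_rule("Draw_Tree"))
```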

(11) Nouns should be singular.

(12) Functions which perform "conversion" require a bit of extra attention. In the worst case, we might require the following extension to the grammar for names:

<obj>To<obj>

For example, the Strings function takes an offset from DB[string] and returns the associated sequence from the database. First we would apply rule 11 to change Strings to String. In the full form, this would become OffsetToString. However, the type specification in the Strings function is offset : integer, so reasonably intelligent readers will immediately know that Strings takes an Offset and returns a String. Here we may leave out the OffsetTo entirely. When any ambiguity arises, we suggest the above <obj>To<obj> syntax.
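The choice between the short and the full form of a conversion name can be sketched as follows. The function conversion_name and its ambiguous flag are hypothetical. Whether a name is ambiguous is a judgement made by the person naming the routine, so it is passed in rather than computed.

```python
def conversion_name(source: str, target: str, ambiguous: bool = True) -> str:
    """Name a conversion routine.

    Use the full <obj>To<obj> form when the source type is not obvious
    from the parameter types; otherwise the target noun alone suffices.
    """
    return f"{source}To{target}" if ambiguous else target


print(conversion_name("Offset", "String"))
print(conversion_name("Offset", "String", ambiguous=False))
```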

Gaston Gonnet