Genomic alignment.

Hein J, Støvlbaek J

As sequencing techniques become more efficient, the average length of a sequence is bound to grow. Traditional sequence-comparison algorithms can compare either DNA or protein, but not a mixture of the two, even though mixed sequences are common in practice. Most DNA sequences obtained contain coding regions, and it is more reliable to compare coding regions as protein than as DNA alone. A heuristic algorithm is presented that can compare DNA containing both coding and noncoding regions, and that can also compare multiple reading frames and determine which exons are homologous. A program, GenA1 (Genomic Alignment), was developed that implements the algorithm. Its use is demonstrated on two retroviruses.
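The claim that coding regions are compared more reliably as protein than as DNA can be illustrated with a small sketch. This is not the GenA1 algorithm; it only shows how synonymous codon changes lower DNA identity while leaving the translated protein unchanged. It assumes the standard genetic code, and the two toy sequences and the helper names `translate` and `identity` are hypothetical, chosen for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption;
# other genetic codes would need a different amino-acid string).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate DNA to protein in the given reading frame (0, 1, or 2).

    Codons containing unknown characters are translated as 'X'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical toy coding sequences that differ only at synonymous sites.
seq_a = "ATGGCTAAAGGTCTG"
seq_b = "ATGGCCAAGGGACTT"

dna_identity = identity(seq_a, seq_b)
protein_identity = identity(translate(seq_a), translate(seq_b))

print(f"DNA identity:     {dna_identity:.2f}")
print(f"Protein identity: {protein_identity:.2f}")
```

In this toy case the two sequences differ at four of fifteen nucleotides yet encode the same five residues, so a protein-level comparison recovers a relationship that the DNA-level comparison partly obscures. The `frame` parameter hints at the reading-frame question the abstract raises: the same DNA gives a different protein in each frame.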

Keywords: HIV-1; HIV-2; DNA, Viral; Sequence Alignment; Amino Acid Sequence; Base Sequence; Genes, gag; Genes, pol; Reading Frames; Algorithms; Software; Molecular Sequence Data