Bioinformatics/Sequence Searches and Alignment

From Wikibooks, open books for an open world

Assume we have obtained a sequence (DNA, RNA, or polypeptide), by which we mean a string of one-letter codes from the respective alphabet. This is the query sequence: the input that we want to compare with stored sequences or align with reference sequences. A search tool parses this string and assigns a score to every query/reference letter pair. The search or alignment tool then reports the highest-scoring comparison. In fact, the search is itself an alignment. When speed matters more than exactness, we use a heuristic search instead, which imposes several simplifying assumptions to reduce the complexity of the alignment algorithm.
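To make the idea of scoring letter pairs concrete, here is a minimal sketch in Python. It is not how any particular search tool works. It assumes a simple scoring scheme (+1 for a match, -1 for a mismatch, -2 for a gap) and uses a plain global alignment by dynamic programming. The function names `alignment_score` and `best_hit` and the scoring values are illustrative assumptions, not part of the text above.

```python
def alignment_score(query, reference, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of two sequences.

    The scoring values are assumptions chosen for illustration only.
    """
    # prev[j] is the best score for aligning the query prefix processed
    # so far with reference[:j]; before any query letter it is all gaps.
    prev = [j * gap for j in range(len(reference) + 1)]
    for i, q in enumerate(query, start=1):
        curr = [i * gap]  # query[:i] aligned against an empty reference
        for j, r in enumerate(reference, start=1):
            pair = match if q == r else mismatch
            curr.append(max(
                prev[j - 1] + pair,  # align letter q with letter r
                prev[j] + gap,       # q is aligned with a gap
                curr[j - 1] + gap,   # r is aligned with a gap
            ))
        prev = curr
    return prev[-1]


def best_hit(query, references):
    """Return the stored sequence with the highest alignment score."""
    return max(references, key=lambda ref: alignment_score(query, ref))


if __name__ == "__main__":
    stored = ["TTTT", "ACGA", "ACGT"]
    print(best_hit("ACGT", stored))
```

The work grows with the product of the two sequence lengths, which is why searching a large database this way is slow, and why heuristic searches accept simplifying assumptions in exchange for speed.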