Bioinformatics - Sequence Relationships: Similarity & Alignments

According to the concepts of inheritance and descent, living organisms can be traced back to their forebears. According to the hypothesis of divergent evolution, going further and further back in time should allow us to identify common ancestors of species that are distinct today. Ultimately, all current life forms may be related to each other.

Support for this hypothesis can be found in our genes and proteins. Could there be any organisms more different than E. coli, lettuce, yeast, worms, flies, and humans? Yet humans share genes and proteins of similar function and sequence with other animals, plants, and even bacteria. Since life in its diversity might trace back to one ancestral life form, all our genomes might trace back to the genome of this very same organism. The differences among contemporary genomes would have been introduced during the ensuing billions of years of genomic change and evolution that led to today's diversity. This divergent development of life is likened to a tree, with newly emerging species and kingdoms representing the branching points of this tree of life.
The amount of similarity between two sequences is a measure of their relatedness. The relationships between nucleotide sequences can differ from the relationships between amino acid sequences which, in turn, may differ from the relationships in structure and function. Closely related sequences are usually more similar than distantly related sequences. Conversely, similar sequences may be more closely related than dissimilar sequences. Sequence similarity serves to estimate evolutionary distance, on the assumption that similarity beyond what can be expected by chance alone indicates relatedness. Determining sequence similarity is not trivial, though. It requires sophisticated computer algorithms that attempt to align sequences with each other in order to determine and score the identities and differences between them. Aligning sequences is the basis for many research objectives, such as finding genes, determining relationships, and finding sequences in databases. The goal of this section is to understand how alignments work and how they are used to determine relationships between sequences and organisms.
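
To make the idea of aligning sequences and scoring their identities and differences concrete, here is a minimal sketch in Python. It uses the Needleman-Wunsch method, one classic approach to global alignment, which the text above does not name. The scoring values (match +1, mismatch -1, gap -2) and the function names global_alignment and percent_identity are assumptions chosen for illustration only. Real tools use substitution matrices and more refined gap penalties.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    The scoring values are illustrative assumptions, not recommended defaults.
    Returns (score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(x, y):
        return match if x == y else mismatch

    # Fill the table: each cell holds the best score for the two prefixes.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1])
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)

    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                score[i][j] == score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1])):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same residue (gaps never count)."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    s, top, bottom = global_alignment("ACGT", "AGT")
    print(s)                              # 1
    print(top)                            # ACGT
    print(bottom)                         # A-GT
    print(percent_identity(top, bottom))  # 75.0
```

In this toy example, the two sequences share three identical positions and differ by one gap, giving 75% identity over the alignment. Whether such a value indicates relatedness depends on how much similarity would be expected by chance, which this sketch does not estimate.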