Mancheron et al. attempt to rework current methods of genome comparison with their new program, QOD. Comparative genomics addresses problems such as annotating genes, inferring homology or orthology, and revealing syntenic regions and rearrangement events; in short, it is used to identify the genomic regions that are either shared among or specific to individuals, strains, or species. Two main methods have been used to meet these needs in the past. The first is the highly time-consuming BLAST-hit approach, which is the method currently used by members of the Lane Lab. This method is widely accepted but computationally demanding. In the lab many of our jobs take days to run, and even on the powerful Oscar CCV cluster at Brown University they can take hours to days. The second is whole genome alignment, which, according to Mancheron et al., is a “highly computationally difficult optimization problem” that requires trained users and, even when carried out correctly, may not lead to clear results. Mancheron et al. set out to create a new method of genome comparison without these drawbacks, and I believe they may have delivered it.
Mancheron et al. use a novel concept called ‘maximum common intervals’, which they define as a genome region that is shared across all genomes and cannot be extended. Finding these intervals can be done with a fast algorithm that yields a unique solution. First, they prepare the input for the algorithm: a target genome and any number of reference genomes. For each reference genome, all local pairwise similarities whose statistical significance lies above a user-defined threshold are returned as a set of short genomic intervals, paired so that one sequence from the reference is matched with one sequence from the target. An interval is considered ‘common’ if every reference genome has a copy of it. A common interval is a maximum common interval if no larger common interval covers that region.
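To make the definition concrete, here is a minimal sketch in Python. It is not the authors' algorithm (theirs is described as fast; this one is a naive pairwise intersection), and all names and coordinates are hypothetical. It assumes each reference genome has already been reduced to a list of inclusive (start, end) intervals on the target genome, one per similarity hit, with non-negative coordinates.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) on the target genome


def keep_maximal(intervals: List[Interval]) -> List[Interval]:
    """Drop every interval that is contained in another interval."""
    # Sort by start; for equal starts put the longest interval first.
    ordered = sorted(set(intervals), key=lambda iv: (iv[0], -iv[1]))
    maximal: List[Interval] = []
    furthest_end = -1  # assumes non-negative coordinates
    for start, end in ordered:
        # An earlier interval starts at or before this one, so this one
        # is contained in it unless it reaches past the furthest end seen.
        if end > furthest_end:
            maximal.append((start, end))
            furthest_end = end
    return maximal


def common_intervals(per_reference: List[List[Interval]]) -> List[Interval]:
    """Target intervals covered by one hit in every reference, unextendable."""
    if not per_reference:
        return []
    common = keep_maximal(per_reference[0])
    for hits in per_reference[1:]:
        overlaps: List[Interval] = []
        for a_start, a_end in common:
            for b_start, b_end in hits:
                start, end = max(a_start, b_start), min(a_end, b_end)
                if start <= end:  # the two intervals share at least one position
                    overlaps.append((start, end))
        common = keep_maximal(overlaps)
    return common


# Toy input: two reference genomes, each with two hits on the target.
reference_a = [(0, 100), (150, 300)]
reference_b = [(50, 200), (250, 400)]
print(common_intervals([reference_a, reference_b]))
```

With the toy input, positions 50 to 100, 150 to 200, and 250 to 300 of the target are covered in both references, so those three intervals are reported. The nested loop is quadratic in the number of hits, which is fine for illustration but not for real genomes.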
Mancheron et al. demonstrated the capabilities of their new technique by comparing the three available strains of E. ruminantium. They identified what percentage of each strain was covered by intervals common to the other strains and reported the genes that were not common.
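The percentage reported for each strain can be thought of as simple coverage arithmetic. The sketch below is an assumption about how such a figure could be computed, not the paper's code: it merges possibly overlapping inclusive (start, end) intervals and divides the covered length by the genome length. The function name and the numbers are hypothetical.

```python
from typing import List, Tuple


def coverage_percent(intervals: List[Tuple[int, int]], genome_length: int) -> float:
    """Percentage of genome positions covered by at least one interval."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    run_start = run_end = None
    for start, end in sorted(intervals):
        if run_end is None or start > run_end:
            # The previous run is finished: count it and start a new one.
            if run_end is not None:
                covered += run_end - run_start + 1
            run_start, run_end = start, end
        else:
            # Overlaps the current run: extend it if needed.
            run_end = max(run_end, end)
    if run_end is not None:
        covered += run_end - run_start + 1
    return 100.0 * covered / genome_length


# Toy example: two overlapping intervals on a 200-position genome.
print(coverage_percent([(0, 59), (40, 99)], 200))  # 50.0
```

Merging before summing matters because common intervals may overlap; summing their raw lengths would count the shared positions twice.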
Because of the ease of use of its GUI and the speed at which it operates, I believe this program could overtake BLAST and whole genome alignment in the near future. Regardless, it is worth looking into if you are doing genome comparisons.
Mancheron A, Uricaru R, Rivals E. An alternative approach to multiple genome comparison. LIRMM. Nucleic Acids Res. 2011 Jun 6.