Scipio

For many types of analyses, data about gene structure and the locations of non-coding regions of genes are required. Although a vast amount of genomic sequence data is available, precise annotation of genes is lagging behind. Finding the gene that corresponds to a given protein sequence with conventional tools is error-prone and cannot be completed without manual inspection, which is time-consuming and requires considerable experience.

Scipio is a tool, based on the alignment program BLAT, that determines the precise gene structure given a protein sequence and a genome. It identifies intron-exon borders and splice sites and is able to cope with sequencing errors and with genes spanning several contigs in genomes that have not yet been assembled into supercontigs or chromosomes. Instead of producing a set of hits with varying confidence, Scipio gives the user a coherent summary of the locations on the genome that code for the query protein. The output contains information about discrepancies that may result from sequencing errors. Scipio has also been used successfully to find homologous genes in closely related species.
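To make the notion of a gene structure concrete, the following Python sketch derives intron intervals from a list of exon coordinates and checks them for the canonical GT...AG splice-site dinucleotides. It is an illustration only, not Scipio's algorithm or its output format: the function names, the coordinate convention (0-based, half-open, forward strand), and the toy contig are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical convention for this sketch: 0-based, half-open [start, end)
# intervals on the forward strand of a single contig.
Interval = Tuple[int, int]

# Toy contig (made-up data): exon "ATGGCC", intron "GTAAGTTTCAG", exon "GATTAA".
CONTIG = "ATGGCC" + "GTAAGTTTCAG" + "GATTAA"


def introns_between(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive exons as intron intervals."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end:  # skip adjacent or overlapping exons
            introns.append((prev_end, next_start))
    return introns


def has_canonical_splice_sites(contig: str, intron: Interval) -> bool:
    """Check whether an intron starts with GT and ends with AG."""
    start, end = intron
    seq = contig[start:end].upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


if __name__ == "__main__":
    exons = [(0, 6), (17, 23)]
    for intron in introns_between(exons):
        print(intron, has_canonical_splice_sites(CONTIG, intron))
```

A real gene prediction additionally has to handle the reverse strand, non-canonical splice sites, frameshifts caused by sequencing errors, and exons located on different contigs, none of which this sketch attempts.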

Scipio on the web

References: 
  • O. Keller, F. Odronitz, M. Stanke, M. Kollmar & S. Waack (2008). Scipio: Using protein sequences to determine the precise exon/intron structures of genes and their orthologs in closely related species. BMC Bioinformatics 9, 278.
  • F. Odronitz, H. Pillmann, O. Keller, S. Waack & M. Kollmar (2008). WebScipio: An online tool for the determination of gene structures using protein sequences. BMC Genomics 9, 422.
Companion Programs: