Help & Documentation
The EGPRED server uses four different programs: (i) GENSCAN; (ii) HMMgene; (iii) BLAST (BLASTX and BLASTN); and (iv) SPUTNIK. It also uses three PERL scripts developed by Sanja Rogic (Rogic et al., 2002), which implement the (a) Exon Union-Intersection (EUI); (b) EUI frame; and (c) GI methods.
By default, EGPRED runs every program with the parameters that its authors report as best for predicting protein-coding genes in eukaryotic genomic DNA. Users who wish to combine the programs within the EGPRED strategy in a way better suited to their own needs can do so by selecting different parameters for the individual programs.
Rogic et al. (2002) developed three PERL scripts that combine the output of two ab initio gene prediction programs, GENSCAN and HMMgene, in three different ways: the Exon Union-Intersection (EUI), EUI frame, and GI methods.
The server accepts only FASTA-formatted sequences. A sequence can be supplied either by pasting it directly into the field provided or by uploading a sequence file using the browse function in the submit form of the EGPRED server.
The EGPRED server outputs a colored GIF image that graphically displays the protein-coding regions predicted by the different programs. The image is hyperlinked to the original results from the individual programs so that users can view those results in tabular format.
Supplying an e-mail address is optional. Users are provided with an e-mail reply facility: when a query completes, the EGPRED server notifies the user and provides a link to the result page.
Users can restrict the analysis of their sequence to a particular region (say, positions 1010 to 3100 of a 4500 bp sequence). This saves time because repeated cut-and-paste operations are avoided.
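The effect of restricting the analysis to a region can be illustrated with a short Python sketch. It is not EGPRED's own code: the function name `extract_region` is hypothetical, and the sketch assumes that positions are counted from 1 and that both end points are included, which the server documentation does not state explicitly.

```python
def extract_region(sequence, start, end):
    """Return the part of `sequence` from `start` to `end`.

    Assumption: coordinates are 1-based and inclusive, as is common
    for sequence positions; EGPRED's actual convention may differ.
    """
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(
            f"region {start}-{end} is outside a sequence of length {len(sequence)}"
        )
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# A made-up 4500 bp sequence, restricted to positions 1010-3100.
sequence = "ACGT" * 1125
region = extract_region(sequence, 1010, 3100)
print(len(region))  # 3100 - 1010 + 1 = 2091 bases
```

Under these assumptions the selected region is 2091 bp long, and only that part of the sequence would be analysed.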
The FASTA format is a well-known format for representing biomolecular sequences in an ordered way. Each record starts with a single identifier line beginning with a ">" tag. This line is a comment line and usually describes the sequence. The sequence itself begins on the second line and continues until another ">" tag appears at the beginning of a line or the input ends.
>Sequence 1 ID Field as Example
ACGTGTTTTT....
TGGCACCGG.....
TGCCCCAC......
>Sequence 2 ID Field
TGGCACTTTT....
ACGCCCGGC.....
TCAGTGTC......
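For readers who want to check their input before submitting it, the FASTA layout described above can be read with a few lines of Python. This is a minimal sketch, not the parser used by the EGPRED server; the function name `parse_fasta` is hypothetical, and the sketch assumes that blank lines and any text before the first ">" line can be ignored.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (identifier, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            # A new ">" line closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are joined into one continuous string.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = """>Sequence 1 ID Field as Example
ACGTGTTTTT
TGGCACCGG
>Sequence 2 ID Field
TGGCACTTTT
"""
for identifier, sequence in parse_fasta(example):
    print(identifier, len(sequence))
```

Each record is returned as a pair of the identifier line (without the ">" tag) and the full sequence with its line breaks removed.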