Help & Documentation

  1. Programs Used in EGPRED:

    The EGPRED server uses four different programs: (i) GENSCAN; (ii) HMMgene; (iii) BLAST (BLASTX and BLASTN); and (iv) SPUTNIK. It also uses three PERL scripts developed by Sanja Rogic (Rogic et al., 2002), which implement the (a) Exon Union-Intersection (EUI); (b) EUI frame; and (c) GI methods.

  2. Program Choices:

    By default, EGPRED runs every program with the parameters that its authors report as best for predicting protein-coding genes in eukaryotic genomic DNA. Users who prefer to combine the programs under the EGPRED strategy in a way better suited to their needs can do so by selecting different parameters for the individual programs.

  3. Similarity Search (BLASTX + BLASTN) Parameters:

    Users can change the type of matrix, the word length, the database searched, and, importantly, how HSPs (high-scoring segment pairs) are parsed as exon/intron predictions. By default, the input sequence is filtered for low-complexity regions. The BLAST programs are run on the forward strand only, which saves time. For BLASTN, EGPRED performs an ungapped alignment by default.

  4. PERL scripts developed by Sanja Rogic:

    Rogic et al. (2002) developed three PERL scripts that combine the output of two ab initio gene prediction programs, GENSCAN and HMMgene, in three different ways. These are the (a) Exon Union-Intersection (EUI); (b) EUI frame; and (c) GI methods.

  5. Sequence Input Format:

    The server accepts only FASTA-formatted sequences. A sequence can be pasted directly into the field provided, or a sequence file can be uploaded using the browse function in the submission form of the EGPRED server.

  6. EGPRED output:

    The EGPRED server outputs a colored GIF image that graphically displays the protein-coding regions predicted by the different programs. The image is hyperlinked to the original results of the individual programs, so users can also view those results in tabular format.

  7. E-mail Reply Facility:

    This facility is optional. When it is used, the EGPRED server notifies the user by e-mail once the query has completed and provides a link to the result page.

  8. Restrict Region for Analysis:

    Users can restrict the analysis to a particular region of their sequence (say, positions 1010 to 3100 of a 4500 bp sequence). This saves time by avoiding repeated cut-and-paste operations.
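
    The effect of restricting a region can be illustrated with a short Python sketch. It is not the server's own code: the function name restrict_region is hypothetical, and the sketch assumes that positions are 1-based and that both end points are included, which this page does not state explicitly.

```python
def restrict_region(sequence: str, start: int, end: int) -> str:
    """Return the part of `sequence` from `start` to `end`.

    Assumption: positions are 1-based and inclusive at both ends.
    """
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError(
            f"region {start}-{end} is outside a sequence of length {len(sequence)}"
        )
    # Python slices are 0-based and exclude the end index,
    # so shift only the start position by one.
    return sequence[start - 1:end]


# A made-up 4500 bp sequence, restricted to positions 1010..3100.
example_sequence = "ACGT" * 1125
region = restrict_region(example_sequence, 1010, 3100)
print(len(region))  # 2091 bases (3100 - 1010 + 1)
```

    Under these assumptions, the region 1010 to 3100 contains 2091 bases, and a region that falls outside the sequence is rejected instead of being silently truncated.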

  9. FASTA Format:

    FASTA is a well-known text format for representing biomolecular sequences. Each record starts with a single identifier line beginning with a ">" character; this line acts as a comment and usually describes the sequence. The sequence itself starts on the second line and continues until another line beginning with ">" appears or the input ends.
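
    The rules above can be sketched as a small Python parser. This is a minimal illustration and not the parser used by the EGPRED server; the function name parse_fasta and the example records are made up for this sketch.

```python
from typing import Dict, List, Optional


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into a mapping of header line -> sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    chunks: List[str] = []

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new ">" line closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            chunks.append(line)

    # The last record ends where the input ends.
    if header is not None:
        records[header] = "".join(chunks)
    return records


example = """>seq1 hypothetical test sequence
ACGTAC
GGTTAA
>seq2
TTGACA
"""

for name, sequence in parse_fasta(example).items():
    print(name, len(sequence))
```

    Sequence lines belonging to one record are joined into a single string, so line breaks inside a sequence do not affect the result.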