The following points should be kept in mind while using PSLpred:
Name of the protein sequence:
This field is optional; the user may leave the name of the sequence blank.
A sequence can be submitted in two ways. The user can paste the sequence directly into the input field provided, or upload a file using the "BROWSE" option. Sequences must be entered in the one-letter amino acid code. All non-standard characters will be ignored.
PSLpred accepts both formatted and unformatted protein sequences. It uses the ReadSeq routine to parse the input. The user should check the format of the input sequence before submitting it for prediction. The prediction results will be wrong if the chosen format is wrong.
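The input rules above (one-letter code, non-standard characters ignored) can be illustrated with a short Python sketch. This is not the server's actual code: the helper name `clean_sequence`, the skipping of FASTA-style `>` header lines, and the upper-casing of lower-case letters are all assumptions made for illustration.

```python
# Hypothetical pre-processing helper; the server's own parsing (via ReadSeq)
# may behave differently.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def clean_sequence(raw: str) -> str:
    """Return only the standard amino acid letters found in the raw input."""
    # Assumption: lines starting with '>' are FASTA headers, not sequence data.
    body_lines = [line for line in raw.splitlines() if not line.startswith(">")]
    joined = "".join(body_lines).upper()  # assumption: lower case is accepted
    # Drop whitespace, digits, gaps, and any other non-standard character.
    return "".join(ch for ch in joined if ch in STANDARD_AMINO_ACIDS)


if __name__ == "__main__":
    print(clean_sequence(">example\nMKT*LL 12\nacd-X"))
```

Running a check like this locally before submission makes it easy to see which characters would be discarded from a pasted sequence.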
Five approaches are available for predicting the subcellular localization of proteins. Users can choose any one of them. A brief account of each approach is given below:
Amino acid composition:
An SVM module developed on the basis of the fraction of each of the 20 amino acid types present in the protein sequence. Calculating the amino acid composition generates a 20-dimensional input vector for each protein sequence, which is used as input to the SVM models. The composition-based SVM module achieved an overall accuracy of 79.4%.
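As a minimal sketch of how such a 20-dimensional composition vector could be computed, the Python function below counts each standard amino acid and divides by the number of standard residues. The function name, the alphabetical ordering of the vector, and the handling of empty input are assumptions; the server's own feature ordering and scaling are not stated here.

```python
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 dimensions


def amino_acid_composition(seq: str) -> List[float]:
    """Fraction of each of the 20 standard amino acids in the sequence."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for ch in seq.upper():
        if ch in counts:  # non-standard characters are ignored
            counts[ch] += 1
    total = sum(counts.values())
    if total == 0:
        # Assumption: an empty sequence maps to the zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    vector = amino_acid_composition("AACD")
    print(len(vector), vector[:3])
```

For the toy sequence `AACD`, half of the residues are alanine, so the first component is 0.5 and the components for C and D are 0.25 each.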
Composition of physico-chemical properties:
An SVM module developed on the basis of the composition of 33 physico-chemical properties of the protein sequence. The SVM module is provided with a 33-dimensional input vector for each sequence. The overall accuracy of the properties-based SVM module is 77.4%.
Dipeptide composition:
The dipeptide composition-based SVM module encapsulates information about amino acid composition along with the local order of amino acids. It uses a fixed-length input vector of 400 dimensions. The SVM module achieved an overall accuracy of 79.9%, similar to the amino acid composition-based SVM module.
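The 400 dimensions correspond to the 20 x 20 possible ordered pairs of amino acids. The sketch below shows one common way to compute such a vector: count each pair of adjacent residues and divide by the number of adjacent pairs (sequence length minus one). The function name, the ordering of the 400 dimensions, and the choice to strip non-standard characters before forming pairs are assumptions, not the server's documented behaviour.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs: "AA", "AC", ..., "YY" (assumed dimension ordering).
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq: str) -> List[float]:
    """Fraction of each of the 400 dipeptides among adjacent residue pairs."""
    # Assumption: non-standard characters are removed before pairing.
    residues = "".join(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total_pairs = len(residues) - 1
    if total_pairs <= 0:
        # Fewer than two residues: no dipeptides, return the zero vector.
        return [0.0] * len(DIPEPTIDES)
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total_pairs):
        counts[residues[i:i + 2]] += 1
    return [counts[dp] / total_pairs for dp in DIPEPTIDES]


if __name__ == "__main__":
    vector = dipeptide_composition("ACAC")
    print(len(vector), vector[DIPEPTIDES.index("AC")])
```

Because the vector length is always 400 regardless of sequence length, proteins of different sizes can be fed to the same SVM model.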
PSI-BLAST:
Because the homology of a protein with related sequences also provides a broad range of evolutionary information, we have also developed a PSI-BLAST module to predict the subcellular localization of prokaryotic proteins. This module performs worse than the other modules developed in the present study. The SVM module based on this approach predicted the subcellular localization of proteins with an overall accuracy of 26%.
Hybrid-based approach:
To enhance prediction accuracy, we devised a method that encapsulates more comprehensive information about a protein. An SVM-based module, called the hybrid module, was constructed from amino acid composition, dipeptide composition, and PSI-BLAST results. This module uses an input vector of 423 dimensions. The hybrid module achieved an overall accuracy of 83.2%.
The output shows the input data as submitted by the user along with the prediction results. It gives the name (if provided), the input sequence, the length of the sequence, and the prediction approach selected by the user.