Background: The malaria parasite, the causative agent of malaria, secretes a variety of proteins that support its growth and reproduction.
Objective: Identifying the secretory proteins of the malaria parasite provides an important reference for the development of antimalarial vaccines and drugs.
Methods: In this study, a computational classification method was developed to identify the secreted proteins of Plasmodium. Amino acid composition, dipeptide composition, and tripeptide composition, together with reduced amino acid alphabets, were used to represent protein sequences. A support vector machine (SVM) was then trained and evaluated on each feature set separately, and the features were optimized.
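The feature extraction described in the Methods can be illustrated with a minimal sketch, assuming a single reduced alphabet and dipeptide composition only. The five-group clustering `EXAMPLE_CLUSTERS`, the example sequence, and all function names are hypothetical and chosen for illustration; they are not claimed to match any of the alphabets or the implementation used in the study.

```python
from itertools import product

# Hypothetical 5-group reduction covering the 20 standard residues.
# Illustrative only: not claimed to be one of the study's alphabets.
EXAMPLE_CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DEGP"]


def build_reduction_map(clusters):
    """Map every residue to the first letter of its cluster."""
    return {res: cluster[0] for cluster in clusters for res in cluster}


def reduce_sequence(seq, mapping):
    """Rewrite a sequence in the reduced alphabet, skipping unknown residues."""
    return "".join(mapping[r] for r in seq.upper() if r in mapping)


def dipeptide_composition(seq, alphabet):
    """Return dipeptide frequencies in a fixed order over the given alphabet."""
    pairs = ["".join(p) for p in product(alphabet, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    for i in range(len(seq) - 1):
        counts[seq[i:i + 2]] += 1
    total = max(len(seq) - 1, 1)  # avoid division by zero on short input
    return [counts[p] / total for p in pairs]


mapping = build_reduction_map(EXAMPLE_CLUSTERS)
alphabet = sorted(set(mapping.values()))  # one representative per cluster
reduced = reduce_sequence("MKRFSTAVLD", mapping)  # made-up sequence
features = dipeptide_composition(reduced, alphabet)
```

With five clusters the dipeptide vector has 25 dimensions instead of the 400 obtained from the full 20-letter alphabet. A vector of this kind is what would be passed to an SVM classifier.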
Results: A total of 74 reduced amino acid alphabets were employed to predict secretory proteins. With dipeptide composition, the accuracy improved to 91.67% with a Matthews correlation coefficient (MCC) of 0.84, and the highest prediction accuracy reached 92.26% after feature selection, indicating that the method is reliable for predicting secreted proteins of the malaria parasite.
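For reference, the two reported metrics can be computed from a binary confusion matrix as in the following sketch, which uses the standard definitions of accuracy and MCC. The function names are illustrative assumptions, and no counts from the study are reproduced here.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

MCC ranges from -1 to 1 and takes all four cells of the confusion matrix into account, which makes it a useful complement to accuracy when the two classes differ in size.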
Conclusion: An intuitive web server, iSP-RAAC (http://bioinfor.imu.edu.cn/isppseraac), was established for the convenience of experimental scientists.
Keywords: Secretory proteins, reduced amino acid alphabets, dipeptide composition, prediction, malaria parasite, antimalarial vaccines.