Protein & Peptide Letters

Author(s): Yu-Fang Qin, Chun-Hua Wang, Xiao-Qing Yu, Jie Zhu, Tai-Gang Liu and Xiao-Qi Zheng

DOI: 10.2174/092986612799789350

Predicting Protein Structural Class by Incorporating Patterns of Over-Represented k-mers into the General Form of Chou’s PseAAC

Page: [388 - 397] Pages: 10

Abstract

Computational prediction of protein structural class from sequence data remains a challenging problem in protein science. In this paper, a new feature extraction approach based on relative polypeptide composition is introduced. The approach takes into account the background distribution of a given k-mer under a Markov model of order k-2, and avoids the curse of dimensionality as k increases by using a T-statistic feature selection strategy. The selected features are then fed to a support vector machine to perform the prediction. To verify the performance of the method, jackknife cross-validation tests are performed on four widely used benchmark datasets. Comparison with existing methods shows that the method provides satisfactory performance for structural class prediction.
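The feature extraction and selection steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption: the expected count of a k-mer under an order k-2 Markov model is estimated with the common estimator N(w[:-1]) * N(w[1:]) / N(w[1:-1]), the relative composition is taken as the observed/expected ratio, and the T-statistic is the two-class Welch form. The function names (`kmer_counts`, `relative_composition`, `t_statistic`, `top_features`) are hypothetical, and the support vector machine step is left out.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq; for k == 0 return the sequence length."""
    if k == 0:
        return Counter({"": len(seq)})
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def relative_composition(seq, k):
    """Observed/expected ratio for every k-mer seen in seq (k >= 2).

    Assumed estimator of the expected count under an order-(k-2) Markov
    model: N(w[:-1]) * N(w[1:]) / N(w[1:-1]).
    """
    observed = kmer_counts(seq, k)
    shorter = kmer_counts(seq, k - 1)   # counts of (k-1)-mers
    middle = kmer_counts(seq, k - 2)    # counts of (k-2)-mers
    ratios = {}
    for w, n in observed.items():
        expected = shorter[w[:-1]] * shorter[w[1:]] / middle[w[1:-1]]
        ratios[w] = n / expected
    return ratios


def t_statistic(a, b):
    """Welch two-sample t-statistic; needs at least two values per class."""
    denom = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / denom if denom > 0 else 0.0


def top_features(samples_a, samples_b, n):
    """Rank k-mers by |t| between two classes and keep the n highest.

    Each sample is a dict as returned by relative_composition; a k-mer
    missing from a sample is treated as 0.0 (an assumption).
    """
    kmers = set().union(*samples_a, *samples_b)
    scores = {
        w: abs(t_statistic([s.get(w, 0.0) for s in samples_a],
                           [s.get(w, 0.0) for s in samples_b]))
        for w in kmers
    }
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores, key=lambda w: (-scores[w], w))[:n]
```

In this sketch a ratio above 1 marks a k-mer that occurs more often than its shorter sub-words would predict, which is one plausible reading of "over-represented". Keeping only the top-ranked k-mers holds the feature vector to a fixed size even though the number of possible k-mers grows as 20^k.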

Keywords: Markov model, protein structural class, relative polypeptide composition, support vector machine, T-statistic