Protein & Peptide Letters

Author(s): Yudong Cai, Jianfeng He, Xinlei Li, Kaiyan Feng, Lin Lu, Kairui Feng, Xiangyin Kong and Wencong Lu

DOI: 10.2174/092986610790963654

Prediction of Protein Subcellular Locations with Feature Selection and Analysis

Page: [464 - 472] Pages: 9


Abstract

In this paper, we propose a strategy for predicting the subcellular locations of proteins by combining several feature selection methods. First, proteins are encoded by their amino-acid composition and physicochemical properties. These features are then ranked by the Minimum Redundancy Maximum Relevance method and further filtered by a feature selection procedure. The Nearest Neighbor Algorithm is used as the prediction model and achieves a correct prediction rate of 70.63%, as evaluated by Jackknife cross-validation. The feature selection results also enable us to identify the most important protein properties. The prediction software is publicly available at http://chemdata.shu.edu.cn/sub22/ and may play an important complementary role to the series of web-server predictors summarized recently in a review by Chou and Shen (Chou, K.C., Shen, H.B. Natural Science, 2009, 2, 63-92, http://www.scirp.org/journal/NS/).

Keywords: Subcellular location of proteins, Minimum Redundancy Maximum Relevance, Feature Selection, Nearest Neighbor Algorithm, Jackknife cross-validation test
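
Illustrative sketch: the abstract describes encoding proteins as feature vectors, classifying them with a nearest-neighbor rule, and estimating accuracy by jackknife (leave-one-out) cross-validation. The Python sketch below shows how those three steps can fit together. It is not the authors' implementation: it uses amino-acid composition only (no physicochemical properties, no Minimum Redundancy Maximum Relevance ranking), it assumes Euclidean distance, and the function names, sequences, and location labels are hypothetical toy values chosen for illustration.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the 20-dimensional amino-acid composition of a sequence."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def nearest_label(query, samples):
    """Label of the sample closest to `query` (Euclidean distance, an assumption)."""
    best = min(samples, key=lambda s: math.dist(query, s[0]))
    return best[1]


def jackknife_accuracy(dataset):
    """Leave each protein out in turn and predict it from all the others."""
    vectors = [(composition(seq), label) for seq, label in dataset]
    correct = 0
    for i, (vec, label) in enumerate(vectors):
        rest = vectors[:i] + vectors[i + 1:]
        if nearest_label(vec, rest) == label:
            correct += 1
    return correct / len(vectors)


# Hypothetical toy data: made-up sequences and labels, not from the paper.
TOY = [
    ("AAAAALLLLL", "membrane"),
    ("AAAALLLLLL", "membrane"),
    ("KKKKKRRRRR", "nucleus"),
    ("KKKKRRRRRR", "nucleus"),
]

if __name__ == "__main__":
    print(jackknife_accuracy(TOY))
```

In a jackknife test every protein serves once as the held-out query, so the reported rate is the fraction of proteins whose nearest remaining neighbor carries the same location label.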