The Golgi apparatus is an important eukaryotic organelle. Successful prediction of Golgi protein types can provide valuable information for elucidating protein functions involved in various biological processes. In this work, we propose a method for predicting Golgi protein types that combines a special mode of pseudo amino acid composition (increment of diversity) with the modified Mahalanobis discriminant. The benchmark dataset used to train the predictor contains 95 Golgi proteins, none of which has ≥ 40% pairwise sequence identity to any other. In the jackknife test, the predictor discriminated cis-Golgi proteins from trans-Golgi proteins with an accuracy of 74.7% and an area under the ROC curve of 0.772. The method was then extended to discriminate cis-Golgi network proteins from cis-Golgi network membrane proteins, and trans-Golgi network proteins from trans-Golgi network membrane proteins. The accuracies obtained were 76.1% and 83.7%, respectively. These results indicate that our method may become a useful tool in the relevant areas. The predictor is freely accessible as a user-friendly web server at http://immunet.cn/SubGolgi/.
Keywords: cis-Golgi proteins, trans-Golgi proteins, Shannon entropy, pseudo-amino acid composition, increment of diversity, modified Mahalanobis discriminant, eukaryotic organelle, sub-Golgi apparatus, Alzheimer's disease, Golgi protein types, single-location Golgi proteins, CD-HIT program, PseAAC, g-gap, positive predictive value, 0-gap dipeptide, 2-gap dipeptide, IDMD algorithm, Scientific Research Foundation of UESTC (JX0769)
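
The abstract names increment of diversity over g-gap dipeptides as the feature, but does not spell out how it is computed. The following is a minimal sketch, assuming the commonly used definition of the diversity measure, D(X) = N log N - Σ n_i log n_i, and of the increment of diversity, ID(X, Y) = D(X + Y) - D(X) - D(Y). The function names and the residue alphabet handling are hypothetical and are not taken from the predictor itself. The modified Mahalanobis discriminant step is not shown.

```python
import math
from collections import Counter

# The 20 standard amino acids; other symbols are skipped (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_counts(sequence, g=0):
    """Count residue pairs separated by g positions (g=0 means adjacent)."""
    counts = Counter()
    for i in range(len(sequence) - g - 1):
        first, second = sequence[i], sequence[i + g + 1]
        if first in AMINO_ACIDS and second in AMINO_ACIDS:
            counts[first + second] += 1
    return counts


def diversity(counts):
    """D(X) = N log N - sum_i n_i log n_i, treating 0 log 0 as 0."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(
        n * math.log(n) for n in counts.values() if n > 0
    )


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y).

    The value is 0 when the two count profiles are proportional and grows
    as they diverge, so a smaller value means the sources are more similar.
    """
    return diversity(x + y) - diversity(x) - diversity(y)
```

In a classifier built on this idea, a query sequence's dipeptide counts would be compared against the pooled counts of each class (for example cis-Golgi and trans-Golgi training sets), and the resulting increment-of-diversity values would serve as input features to the discriminant.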