Identification of Membrane Protein Types Based Using Hypergraph Neural Network

Page: [346 - 358] Pages: 13


Abstract

Introduction: Membrane proteins are among the main components of biological membranes and play an important role in living organisms. Classifying and predicting membrane protein types is an important topic in membrane proteomics research, because knowing a membrane protein's type allows its function to be determined quickly.

Methods: Most current methods for classifying membrane proteins are labor-intensive and resource-demanding. To predict membrane protein types on a large scale, this study used five feature extraction methods: Average Block (AvBlock), Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), Histogram of Oriented Gradients (HOG), and Pseudo-PSSM (PsePSSM). The five resulting feature matrices were then combined, and the corresponding hypergraph association matrix was constructed. Finally, the feature matrices and hypergraph association matrices were fed into a hypergraph neural network (HGNN) model to identify the types of membrane proteins.
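As a rough illustration of the last two steps, the sketch below builds a hypergraph association (incidence) matrix from a feature matrix and applies one hypergraph convolution of the commonly used form σ(D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X Θ). It is a minimal NumPy sketch, not the authors' implementation: the k-nearest-neighbour hyperedge construction, the function names `knn_incidence` and `hgnn_layer`, the value of `k`, and the random toy data are all assumptions made for the example.

```python
import numpy as np

def knn_incidence(features, k=3):
    """Hypothetical helper: build an incidence matrix H (n_nodes x n_edges).

    One hyperedge is created per sample; it contains that sample and its
    k nearest neighbours under Euclidean distance (an assumed construction).
    """
    n = features.shape[0]
    # Pairwise squared Euclidean distances via broadcasting
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)
    H = np.zeros((n, n))
    for e in range(n):
        d = dist[e].copy()
        d[e] = -1.0  # force the centre sample to rank first
        members = np.argsort(d)[: k + 1]  # centre plus k neighbours
        H[members, e] = 1.0
    return H

def hgnn_layer(X, H, Theta, edge_weights=None):
    """One hypergraph convolution followed by ReLU.

    X: (n_nodes, d_in) features, H: (n_nodes, n_edges) incidence matrix,
    Theta: (d_in, d_out) weights (learned in practice, random here).
    """
    n_edges = H.shape[1]
    w = np.ones(n_edges) if edge_weights is None else edge_weights
    dv = H @ w          # vertex degree: total weight of incident hyperedges
    de = H.sum(axis=0)  # hyperedge degree: number of member nodes
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    De_inv = np.diag(1.0 / de)
    G = Dv_inv_sqrt @ H @ np.diag(w) @ De_inv @ H.T @ Dv_inv_sqrt
    return np.maximum(G @ X @ Theta, 0.0)

# Toy data standing in for the combined feature matrix (6 proteins, 4 features)
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
H = knn_incidence(X, k=2)
Theta = rng.normal(size=(4, 3))
out = hgnn_layer(X, H, Theta)
print(out.shape)  # (6, 3)
```

In a full model, several such layers would be stacked and `Theta` trained against the membrane protein type labels; here a single forward pass only shows how the feature matrix and the association matrix enter the computation together.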

Results: The proposed method was evaluated on four membrane protein benchmark datasets, achieving accuracies of 92.8%, 88.6%, 88.2%, and 99.0%, respectively.

Conclusion: HGNN achieved better prediction performance than traditional machine learning classifiers such as Random Forest (RF) and Support Vector Machine (SVM).
