Combining Sequence Entropy and Subgraph Topology for Complex Prediction in Protein Protein Interaction (PPI) Network

Page: [516 - 523] Pages: 8


Abstract

Background: Predicting protein complexes from protein interaction networks remains a challenging task. Most computational approaches focus on the topological structure of protein complexes, and few consider the biological information contained in amino acid sequences.

Objective: To capture the information contained in protein sequences, we compute sequence entropy and sequence length. Proteins interact with one another and form different subgraph topologies.
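The abstract does not define how sequence entropy is computed. As a minimal sketch, assuming it means the Shannon entropy of a sequence's amino acid composition, the two sequence features could be computed as follows. The function names `sequence_entropy` and `sequence_features` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def sequence_entropy(sequence: str) -> float:
    """Shannon entropy (in bits) of the residue composition of one sequence.

    Assumption: "sequence entropy" is taken to mean composition entropy;
    the source abstract does not state the exact definition.
    """
    sequence = sequence.strip().upper()
    if not sequence:
        return 0.0
    counts = Counter(sequence)
    total = len(sequence)
    # H = -sum(p_i * log2(p_i)) over the residues that occur in the sequence
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def sequence_features(sequence: str) -> dict:
    """Return the two sequence-level features named in the abstract."""
    cleaned = sequence.strip()
    return {"entropy": sequence_entropy(cleaned), "length": len(cleaned)}
```

Under this definition, a sequence made of a single repeated residue has entropy 0, and a sequence that uses all 20 standard amino acids equally often reaches the maximum of log2(20) bits.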

Methods: We integrate these biological features with subgraph topological features and model complexes using a Logistic Model Tree.
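The abstract does not list which topological features are used. The following is a rough sketch of how biological and topological features might be combined into one vector per candidate subgraph. The choice of features (size, density, mean and maximum degree, mean entropy, mean length) and the names `subgraph_features` and `complex_feature_vector` are illustrative assumptions, not the authors' feature set. Training the Logistic Model Tree on such vectors is not shown.

```python
def subgraph_features(nodes, edges):
    """Simple topological descriptors of the subgraph induced by `nodes`.

    `edges` is an iterable of (protein_a, protein_b) pairs from the full
    network; the graph is treated as undirected and self-loops are ignored.
    """
    nodes = set(nodes)
    # Keep only edges whose endpoints both lie inside the candidate subgraph.
    internal = {
        frozenset((u, v))
        for u, v in edges
        if u in nodes and v in nodes and u != v
    }
    degree = {v: 0 for v in nodes}
    for edge in internal:
        for v in edge:
            degree[v] += 1
    n, m = len(nodes), len(internal)
    possible = n * (n - 1) / 2  # number of edges in a complete graph on n nodes
    return {
        "size": n,
        "edges": m,
        "density": m / possible if possible else 0.0,
        "mean_degree": sum(degree.values()) / n if n else 0.0,
        "max_degree": max(degree.values(), default=0),
    }


def complex_feature_vector(nodes, edges, bio):
    """Concatenate topological and sequence-derived features.

    `bio` maps a protein identifier to an (entropy, length) pair
    (hypothetical input format). Proteins missing from `bio` are skipped.
    """
    topo = subgraph_features(nodes, edges)
    members = [bio[p] for p in set(nodes) if p in bio]
    k = len(members)
    mean_entropy = sum(e for e, _ in members) / k if k else 0.0
    mean_length = sum(length for _, length in members) / k if k else 0.0
    return [
        topo["size"],
        topo["density"],
        topo["mean_degree"],
        topo["max_degree"],
        mean_entropy,
        mean_length,
    ]
```

Each vector, labelled as complex or non-complex, would then serve as one training example for the classifier.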

Results: The experimental results demonstrate that our method outperforms four other state-of-the-art computational methods in terms of the number of known protein complexes correctly detected.

Conclusion: Our framework provides insights for future biological studies and might be helpful in predicting other types of subgraph topologies.

Keywords: Protein Protein Interaction (PPI), sequence entropy, subgraph topology, biological features, logistic model tree, cluster.

