Recognition of Ion Ligand Binding Sites Based on Amino Acid Features with the Fusion of Energy, Physicochemical and Structural Features

Page: [1093 - 1102] Pages: 10


Abstract

Background: Rational drug molecular design based on virtual screening requires the ligand binding site to be known. Recently, the recognition of ion ligand binding sites has become an important research direction in pharmacology.

Methods: In this work, we selected the binding residues of 4 acid radical ion ligands (NO₂⁻, CO₃²⁻, SO₄²⁻ and PO₄³⁻) and 10 metal ion ligands (Zn²⁺, Cu²⁺, Fe²⁺, Fe³⁺, Ca²⁺, Mg²⁺, Mn²⁺, Na⁺, K⁺ and Co²⁺) as research objects. Based on the protein sequence information, we extracted amino acid, energy, physicochemical, and structural features. We then combined these features and used them as input to the multilayer perceptron (MLP) and support vector machine (SVM) algorithms.
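The feature-fusion step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the segment length or the exact feature definitions, so the sliding window, the padding symbol "X", the half-width of 4, and all function names below are assumptions. The sketch covers only two sequence-derived features (amino acid composition and Shannon information entropy) and concatenates them into one vector per residue; energy, physicochemical, and structural features would be appended in the same way.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def windows(sequence, half_width=4, pad="X"):
    """Yield one fixed-length segment centred on each residue.

    The half-width and the padding symbol are illustrative assumptions.
    """
    padded = pad * half_width + sequence + pad * half_width
    for i in range(len(sequence)):
        yield padded[i:i + 2 * half_width + 1]


def composition(segment):
    """Fraction of each standard amino acid in the segment.

    Padding symbols are not counted, so fractions sum to less than 1
    for segments near the ends of the sequence.
    """
    counts = Counter(segment)
    return [counts[a] / len(segment) for a in AMINO_ACIDS]


def shannon_entropy(segment):
    """Shannon entropy (in bits) of the symbol distribution in the segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def fused_features(sequence, half_width=4):
    """Concatenate composition and entropy into one vector per residue."""
    return [composition(w) + [shannon_entropy(w)]
            for w in windows(sequence, half_width)]


if __name__ == "__main__":
    vectors = fused_features("MKHHCAEDL")
    print(len(vectors), "residues,", len(vectors[0]), "features each")
```

Each resulting vector could then be labelled as binding or non-binding and passed to a classifier such as an MLP or an SVM.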

Results: In the independent test, the best accuracy was higher than 92.5%, which was better than previous results on the same dataset. In addition, we found that energy information is an important factor affecting the prediction results.

Conclusion: Finally, we set up a free web server for the prediction of protein-ion ligand binding sites (http://39.104.77.103:8081/lsb/HomePage/HomePage.html). This study is helpful for molecular drug design.

Keywords: Energy feature, binding site, ion ligand, information entropy, protein-ion ligand binding, rational drug molecular design.
