Recent Development of Machine Learning Methods in Sumoylation Sites Prediction

Page: [894 - 907] Pages: 14

Abstract

Sumoylation is an important reversible post-translational modification of proteins that mediates a variety of cellular processes, including transcriptional regulation and signal transduction. SUMO modification can change a protein's subcellular localization, activity, and stability. Abnormal sumoylation is involved in many diseases, including neurodegenerative and immune-related diseases, as well as the development of cancer. Identifying sumoylation sites (SUMO sites) is therefore fundamental to understanding the molecular mechanisms and regulatory roles of sumoylation. In contrast to labor-intensive and costly experimental approaches, in silico prediction of sumoylation sites has attracted much attention for its accuracy, convenience, and speed. Many computational models have been developed to identify SUMO sites, but they have not yet been comprehensively summarized and reviewed. This paper therefore summarizes and discusses the research progress of the relevant models, focusing mainly on benchmark dataset construction, feature extraction, machine learning methods, published results, and online tools. We hope that this review will be helpful to experimental researchers.

Keywords: SUMO modification, feature selection, machine learning, classification, post-translational modification, sequential forward selection.

[PMID: 33279983]
[130]
Lv, H. Deep-Kcr: Accurate detection of lysine crotonylation sites using deep learning method. Brief. Bioinform., 2021, 22(4), bbaa255.
[http://dx.doi.org/10.1093/bib/bbaa255] [PMID: 33099604]
[131]
Dao, F.Y.; Lv, H.; Su, W.; Sun, Z.J.; Huang, Q.L.; Lin, H. iDHS-Deep: An integrated tool for predicting DNase I hypersensitive sites by deep neural network. Brief. Bioinform., 2021, 22(5), bbab047.
[http://dx.doi.org/10.1093/bib/bbab047] [PMID: 33751027]
[132]
Matthew, C. AngularQA: protein model quality assessment with LSTM networks. Computational and Mathematical Biophysics, 2019, 7(1), 1-9.
[http://dx.doi.org/10.1515/cmb-2019-0001]
[133]
Cao, R.; Freitas, C.; Chan, L.; Sun, M.; Jiang, H.; Chen, Z. ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network. Molecules, 2017, 22(10), E1732.
[http://dx.doi.org/10.3390/molecules22101732] [PMID: 29039790]
[134]
Si, D.; Moritz, S.A.; Pfab, J.; Hou, J.; Cao, R.; Wang, L.; Wu, T.; Cheng, J. Deep Learning to Predict Protein Backbone Structure from High-Resolution Cryo-EM Density Maps. Sci. Rep., 2020, 10(1), 4282.
[http://dx.doi.org/10.1038/s41598-020-60598-y] [PMID: 32152330]
[135]
Hong, Z.; Zeng, X.; Wei, L.; Liu, X. Identifying enhancer-promoter interactions with neural network based on pre-trained DNA vectors and attention mechanism. Bioinformatics, 2020, 36(4), 1037-1043.
[PMID: 31588505]
[136]
Hong, Q.; Yan, R.; Wang, C.; Sun, J. Memristive Circuit Implementation of Biological Nonassociative Learning Mechanism and Its Applications. IEEE Trans. Biomed. Circuits Syst., 2020, 14(5), 1036-1050.
[http://dx.doi.org/10.1109/TBCAS.2020.3018777] [PMID: 32833643]
[137]
Song, B.; Zeng, X.; Jiang, M.; Perez-Jimenez, M.J. Monodirectional Tissue P Systems With Promoters. IEEE Trans. Cybern., 2021, 51(1), 438-450.
[http://dx.doi.org/10.1109/TCYB.2020.3003060] [PMID: 32649286]
[138]
Wei, L.; Tang, J.; Zou, Q. Local-DPP: An Improved DNA-binding Protein Prediction Method by Exploring Local Evolutionary Information. Inf. Sci., 2017, 384, 135-144.
[http://dx.doi.org/10.1016/j.ins.2016.06.026]
[139]
Wei, L.; Xing, P.; Shi, G.; Ji, Z.; Zou, Q. Fast prediction of methylation sites using sequence-based feature selection technique. IEEE/ACM Trans. Comput. Biol. Bioinformatics, 2019, 16(4), 1264-1273.
[PMID: 28222000]