Application of Machine Learning Approaches for the Design and Study of Anticancer Drugs

Yan Hu; Yi Lu; Shuo Wang; Mengying Zhang; Xiaosheng Qu; Bing Niu
Abstract

Background: Globally, the number of cancer patients and cancer deaths continues to increase each year, and cancer has therefore become one of the world's leading causes of morbidity and mortality. In recent years, the study of anticancer drugs has become one of the most active topics in medical research.
Objective: This review examines the application of machine learning to predicting anticancer drug activity. Several machine learning approaches, namely Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbor (kNN), and Naïve Bayes (NB), were selected, and examples of their application in anticancer drug design are listed.
Results: Machine learning contributes substantially to anticancer drug design, saving researchers time and reducing cost. However, it can serve only as an assisting tool for drug design.
Conclusion: This paper introduces the application of machine learning approaches in anticancer drug design. Many successful examples of identifying and predicting anticancer drug activity are discussed, and anticancer drug research remains in active progress. Moreover, the merits of some web servers related to anticancer drugs are mentioned.
Keywords: Machine learning (ML), anticancer drugs, linear discriminant analysis (LDA), principal component analysis (PCA), support vector machine (SVM), random forest (RF), k-nearest neighbor (kNN), naïve Bayes (NB), deep learning, web servers.
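As a rough illustration of one of the approaches named above, the following minimal sketch shows how a k-nearest neighbor (kNN) classifier could label a compound as active or inactive from numeric descriptors. It is not taken from the review: the two-dimensional descriptor vectors, the labels, the `TRAIN` list, and the `knn_predict` helper are all hypothetical and chosen only to keep the example small.

```python
from collections import Counter
from math import dist

# Hypothetical toy data: (descriptor vector, label). The values are invented
# for illustration and do not come from any study discussed in the review.
TRAIN = [
    ((0.9, 0.8), "active"),
    ((1.0, 1.1), "active"),
    ((0.8, 1.0), "active"),
    ((0.1, 0.2), "inactive"),
    ((0.2, 0.0), "inactive"),
    ((0.0, 0.1), "inactive"),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training points by Euclidean distance to the query vector.
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    # Count the labels of the k nearest neighbors and take the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict(TRAIN, (0.85, 0.9)))
    print(knn_predict(TRAIN, (0.05, 0.15)))
```

In practice, the descriptors would be computed from molecular structure or measured data, and k and the distance metric would be chosen by cross-validation; the sketch fixes them only for brevity.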
Cite As
Current Drug Targets
