Machine Learning-based Virtual Screening for STAT3 Anticancer Drug Target

Page: [3023 - 3032] Pages: 10

Abstract

Background: The signal transducer and activator of transcription (STAT) family comprises seven structurally similar and highly conserved members: STAT1, STAT2, STAT3, STAT4, STAT5a, STAT5b, and STAT6. The STAT3 signaling cascade is activated by upstream kinase signals and proceeds through phosphorylation, homo-dimerization, nuclear translocation, and DNA binding, resulting in the expression of target genes involved in tumor cell proliferation, metastasis, angiogenesis, and immune editing. STAT3 hyperactivation has been documented in a number of tumors, including head and neck, breast, lung, liver, kidney, prostate, and pancreatic cancers, multiple myeloma, and acute myeloid leukemia. Drug discovery is a time-consuming and costly process; it may take ten to fifteen years to bring a single drug to the market. Machine learning algorithms are fast and effective and are commonly used in fields such as drug discovery. These algorithms are well suited to the virtual screening of large compound libraries, where molecules are classified as active or inactive.

Objective: The present work aims to perform machine learning-based virtual screening for the STAT3 drug target.

Methods: Machine learning models, namely k-nearest neighbor, support vector machine, Gaussian naïve Bayes, and random forest, were developed to classify compounds as active or inactive inhibitors of the STAT3 drug target. Ten-fold cross-validation was used for model validation. A test dataset prepared from the ZINC database was then screened using the random forest model. A total of 20 compounds were predicted, with 88% accuracy, to be active against STAT3. These 20 compounds were then docked into the active site of STAT3. The two complexes with good docking scores, together with the reference compound, were subjected to MD simulation. A total of 100 ns of MD simulation was performed.
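The ten-fold cross-validation step described in the Methods can be illustrated with a minimal sketch. It uses only the Python standard library and a simple k-nearest neighbor classifier. The data are synthetic stand-ins for molecular descriptors, and every function name and parameter (`make_toy_data`, `knn_predict`, `k_fold_indices`, `cross_validate`, `k=5`) is an assumption made for illustration, not the authors' pipeline or dataset. The folds here are shuffled but not stratified by class.

```python
import random
from collections import Counter


def make_toy_data(n_per_class=50, n_features=4, seed=0):
    """Synthetic stand-in for molecular descriptors (NOT the paper's data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 2.0)):  # 0 = inactive, 1 = active
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, query, k=5):
    """Majority vote among the k training points closest to `query`."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def k_fold_indices(n, n_folds=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_validate(X, y, n_folds=10, k=5):
    """Return per-fold accuracy; each fold is held out exactly once."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out
        )
        scores.append(correct / len(held_out))
    return scores


X, y = make_toy_data()
fold_scores = cross_validate(X, y)
mean_accuracy = sum(fold_scores) / len(fold_scores)
print(f"mean 10-fold accuracy: {mean_accuracy:.2f}")
```

In practice, a comparison of several classifiers such as the one described above would more likely rely on an established machine learning library than on hand-written code; the sketch only shows how each compound is held out exactly once and how per-fold accuracies are averaged into a single validation score.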

Results: The random forest model outperformed all other models. Compared to the standard reference compound, the top two hits showed greater stability and compactness.

Conclusion: The predicted hits have the potential to inhibit STAT3 overexpression and thereby combat STAT3-associated diseases.

Keywords: Machine learning, STAT3, virtual screening, docking, MD simulation, drug target.
