LSTM-SAGDTA: Predicting Drug-target Binding Affinity with an Attention Graph Neural Network and LSTM Approach

Page: [468 - 476] Pages: 9

Abstract

Introduction: Drug development is a challenging and costly process, yet it plays a crucial role in improving healthcare outcomes. Drug development requires extensive research and testing to meet the demands for economic efficiency, cures, and pain relief.

Methods: Drug development is a vital research area that necessitates innovation and collaboration to achieve significant breakthroughs. Computer-aided drug design provides a promising avenue for drug discovery and development by reducing costs and improving the efficiency of drug design and testing.

Results: In this study, we developed LSTM-SAGDTA, a novel model for accurately predicting drug-target binding affinity. We used SeqVec to characterize the protein and graph neural networks to capture information on drug molecules. Introducing self-attention graph pooling improved both the accuracy and the efficiency of the model's predictions.
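To make the pooling step concrete, the following is a minimal sketch of the general self-attention graph pooling idea on a small molecular graph. It is not the authors' implementation. The function names (`normalize_adjacency`, `sag_pool`), the toy graph, and the fixed `weight` vector are all assumptions for illustration; in a trained model the projection weights would be learned. One graph-convolution step assigns each atom a score, the top-scoring fraction of atoms is kept, and the kept features are scaled by their scores.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node also attends to its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def sag_pool(features, adj, weight, ratio=0.5):
    """Keep the top `ratio` fraction of nodes ranked by attention score.

    `weight` is a hypothetical projection vector (learned in practice).
    Returns (kept node indices, gated features, induced adjacency).
    """
    n = len(adj)
    norm = normalize_adjacency(adj)
    # Project each node's feature vector to a scalar: X @ w.
    projected = [sum(f * w for f, w in zip(row, weight)) for row in features]
    # One graph convolution, so scores depend on features and topology.
    scores = [math.tanh(sum(norm[i][j] * projected[j] for j in range(n)))
              for i in range(n)]
    k = max(1, math.ceil(ratio * n))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])
    # Gate the surviving node features by their attention scores.
    pooled_features = [[x * scores[i] for x in features[i]] for i in keep]
    # Keep only the edges between surviving nodes.
    pooled_adj = [[adj[i][j] for j in keep] for i in keep]
    return keep, pooled_features, pooled_adj


# Toy example: a 4-atom chain (0-1-2-3) with 2-dimensional atom features.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
weight = [0.5, -0.25]

keep, pooled_features, pooled_adj = sag_pool(features, adj, weight, ratio=0.5)
print(keep, pooled_adj)
```

Because the scores come from a graph convolution rather than from node features alone, which atoms survive depends on their neighbourhood as well as their own features. The smaller pooled graph is also cheaper for later layers to process, which is one plausible reason such pooling can reduce training cost.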

Conclusion: Moreover, LSTM-SAGDTA achieved higher accuracy than current state-of-the-art methods while requiring less training time. The experimental results suggest that this method is a high-precision solution for drug-target affinity (DTA) prediction.
