Chronological Order Based Wrapper Technique for Drug-Target Interaction Prediction (CO-WT DTI)

Page: [541 - 557] Pages: 17


Abstract

Background: Drug-Target Interactions (DTIs) are used to suggest new medications for diseases or to repurpose existing drugs for other diseases, since experimental procedures take years to complete and Food and Drug Administration (FDA) approval is required before a drug can be marketed.

Objective: Computational methods are favoured over wet-lab experiments in drug analysis because wet-lab work is tedious, time-consuming, and costly. Interactions between drugs and targets are identified computationally, paving the way for discovering previously unknown drug-target interactions for numerous diseases.

Methods: This paper presents a Chronological Order-based Wrapper Technique for Drug-Target Interaction prediction (CO-WT DTI) to discover novel DTIs. In the proposed approach, drug and protein features are obtained with three feature extraction techniques, and dimensionality reduction is applied to remove unfavourable features. Class imbalance is handled by balancing methods, and the performance of the approach is validated on benchmark datasets.
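As an illustration of the balancing step, the following is a minimal sketch of random under-sampling, one of the balancing methods listed in the keywords. It is not the authors' implementation: the function name `random_undersample`, the toy data, and the 1:1 target ratio are assumptions made for this example only.

```python
import random


def random_undersample(features, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Hypothetical helper for illustration; the 1:1 ratio is an assumption.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # The smaller class is kept whole; the larger class is sampled down.
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [features[i] for i in kept], [labels[i] for i in kept]


# Toy data: 2 interacting pairs (label 1) among 8 non-interacting pairs (label 0).
X = [[float(i), float(i % 3)] for i in range(10)]
y = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

X_bal, y_bal = random_undersample(X, y)
print(len(y_bal), sum(y_bal))
```

Under-sampling discards known non-interacting pairs, whereas over-sampling methods such as SMOTE synthesise additional minority samples; which of the two suits a given dataset depends on how few positive pairs it contains.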

Results: The proposed approach has been validated using four widely used benchmark datasets, namely, GPCR (G protein-coupled receptors), enzymes, nuclear receptors, and ion channels. In our experiments, the proposed approach outperforms other state-of-the-art methods on the AUC (area under the Receiver Operating Characteristic (ROC) curve) metric, with prediction performance evaluated by Leave-One-Out Cross-Validation (LOOCV).
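The evaluation protocol named above can be sketched as follows: each sample is held out once, a model is trained on the remaining samples, and the AUC is computed over the pooled held-out scores. This is a minimal sketch only. The nearest-class-mean scorer (`fit_means`, `score_means`) is a hypothetical stand-in for the XGBoost classifier, and the single-feature toy data is invented for the example.

```python
def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def loocv_scores(X, y, fit, score):
    """Hold out each sample once, train on the rest, and score the held-out sample."""
    held_out = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        held_out.append(score(model, X[i]))
    return held_out


# Hypothetical stand-in classifier: distance to each class mean on one feature.
def fit_means(X, y):
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)


def score_means(model, x):
    mu_pos, mu_neg = model
    # Higher score means the sample lies closer to the positive-class mean.
    return abs(x - mu_neg) - abs(x - mu_pos)


X = [0.1, 0.4, 0.35, 0.8, 0.9, 0.7]
y = [0, 0, 0, 1, 1, 1]

scores = loocv_scores(X, y, fit_means, score_means)
result = auc(y, scores)
print(result)
```

Pooling the held-out scores before computing the AUC is needed because a single held-out sample contains only one class, so a per-fold AUC is undefined under LOOCV.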

Conclusion: The combined performance of the feature extraction, balancing, dimensionality reduction, and classification steps suggests that the approach can contribute data to the development of new drugs. It is anticipated that our model will help refine subsequent explorations, especially in the drug-target interaction domain.

Keywords: LPC, drug-target interactions, MSF, over-sampling SMOTE, random under-sampling, k-separated-bigrams PSSM, AADP-PSSM, XGBoost classifier.
