Abstract
Background: Chemical compounds and proteins/genes are an important class of entities in
biomedical research, and their interactions play a key role in precision medicine, drug discovery, basic
clinical research, and knowledge base construction. Many computational methods have been proposed to
identify chemical–protein interactions. However, most of these models cannot capture
long-distance dependencies between chemical and protein entities, the neural networks they use suffer from
vanishing gradients, and few take the chemical structure characteristics of the compound into account.
Methods: To address the above limitations, we propose a novel model, SIMEON, to identify chemical–
protein interactions. First, the input sequence is represented with a pre-trained language model, and an attention
mechanism is used to uncover the degree to which different words contribute to entity relations and potential
semantic information. Secondly, key features are extracted by a multi-layer stacked Bidirectional
Gated Recurrent Units (Bi-GRU)-normalization residual network module, which resolves higher-order
dependencies while overcoming network degradation. Finally, the representation is enhanced by
external knowledge regarding the chemical structure characteristics of the compound.
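The abstract does not specify the module's internals beyond "stacked Bi-GRU with normalization and residual connections," so the following minimal NumPy sketch only illustrates that general pattern: each layer's output is added back to its input (the residual path that counters network degradation) and then layer-normalized. The `fake_bigru` function is a hypothetical stand-in for the actual recurrent unit, not the paper's implementation.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each feature vector to zero mean and unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def residual_block(x, sublayer):
    # Residual connection followed by normalization:
    # output = LayerNorm(x + sublayer(x)).
    # The identity path keeps gradients flowing in deep stacks.
    return layer_norm(x + sublayer(x))

# Hypothetical stand-in for one Bi-GRU layer: any sequence-to-sequence
# map that preserves the hidden dimension fits this slot.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
fake_bigru = lambda x: np.tanh(x @ W)

x = rng.normal(size=(5, 8))   # (sequence length, hidden size)
out = x
for _ in range(3):            # a 3-layer stack, for illustration only
    out = residual_block(out, fake_bigru)
print(out.shape)              # (5, 8)
```

Because normalization re-centers each position's feature vector, stacking more layers does not shift the representation's scale, which is one reason residual-plus-normalization stacks train stably.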
Results: Experimental results show that our stacked integration model combines the advantages
of Bi-GRU, normalization methods, and external knowledge, which complement one another to improve
the performance of the model.
Conclusion: Our proposed model shows good performance in chemical–protein interaction extraction,
and it can serve as a useful complement to biological experiments for identifying chemical–protein interactions.
Keywords:
Chemical–protein interaction, normalization methods, stacked integration model, Bi-GRU, molecular and protein representation, biomedical text.