Abstract
Background: With rising rates of polypharmacy, vigilant surveillance of clinical drug toxicity has become an important concern. Named Entity Recognition (NER) is an essential task for extracting drug-safety insights from the biomedical literature. In recent years, deep learning models have achieved significant advances on NER tasks. Nonetheless, the effectiveness of these techniques depends on large volumes of annotated data, and manual annotation is labor-intensive and inefficient.
Methods: This study introduces a novel approach that avoids the conventional reliance on manually annotated data. It applies a transformer-based Positive-Unlabeled Learning (PULearning) method with an adaptive sampling strategy to a clinical cancer drug toxicity corpus. To improve prediction precision, we employ relative position embeddings within the transformer encoder. In addition, we formulate a composite loss function that integrates two Kullback-Leibler (KL) regularizers to align with PULearning assumptions. The results show that our approach attains the targeted NER performance while relying solely on unlabeled data and named entity dictionaries.
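For context, a composite objective of this kind can be sketched as follows (a minimal illustration assuming a non-negative variant of the standard unbiased PU risk estimator; the KL target distributions $q_1, q_2$ and weights $\lambda_1, \lambda_2$ are illustrative placeholders, not the paper's exact formulation):
$$
\mathcal{L}(f) \;=\; \pi_p\,\hat{R}_p^{+}(f) \;+\; \max\!\bigl\{0,\; \hat{R}_u^{-}(f) - \pi_p\,\hat{R}_p^{-}(f)\bigr\} \;+\; \lambda_1\, D_{\mathrm{KL}}\!\bigl(p_1 \,\|\, q_1\bigr) \;+\; \lambda_2\, D_{\mathrm{KL}}\!\bigl(p_2 \,\|\, q_2\bigr),
$$
where $\pi_p$ is the positive-class prior, $\hat{R}_p^{+}$ and $\hat{R}_p^{-}$ are empirical risks on dictionary-matched (positive) tokens under positive and negative labels, respectively, and $\hat{R}_u^{-}$ is the risk on unlabeled tokens treated as negatives.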
Conclusion: Our model achieves an overall NER F1 of 0.819; specifically, it attains F1 scores of 0.841, 0.801, and 0.815 for DRUG, CANCER, and TOXI entities, respectively. A comprehensive analysis of the results validates the effectiveness of our approach relative to existing PULearning methods on biomedical NER tasks. In addition, a visualization of the associations among the three identified entity types is provided, offering a useful reference for querying their interrelationships.
Keywords: KL regularizers, clinical drug toxicity, named entity recognition (NER), positive-unlabeled learning (PULearning), adaptive sampling, cancer drug.