A Sequential Ensemble Model for Communicable Disease Forecasting

Page: [309 - 317] Pages: 9


Abstract

Background: Ensemble building is a popular method for improving model accuracy in both classification and regression problems.

Objective: We propose a sequential ensemble model to predict the incidence of communicable diseases such as influenza, hand, foot and mouth disease (HFMD), and diarrhea, and compare it with a standard support vector regression model.

Methods: Weekly data for the three diseases (influenza, HFMD, and diarrhea) were collected from the official Hong Kong government website for the years 2010 to 2018. The data were preprocessed with log and z-score transformations. The proposed sequential ensemble model was then applied to the processed data to predict future occurrences.
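The preprocessing named in Methods can be illustrated with a minimal sketch. It is not the authors' code: the order (log first, then z-score), the use of log(1 + x) to tolerate zero-count weeks, the population standard deviation, and the function names `log_zscore` and `invert_log_zscore` are all assumptions made for illustration, and the weekly counts in the usage note are invented.

```python
import math
from statistics import mean, pstdev


def log_zscore(counts):
    """Log-transform weekly counts, then standardize them.

    Assumptions (not stated in the abstract): log(1 + x) is used so that
    zero-count weeks are defined, and the population standard deviation
    is used for scaling.
    """
    logged = [math.log1p(c) for c in counts]
    mu = mean(logged)
    sigma = pstdev(logged)
    if sigma == 0:
        raise ValueError("a constant series cannot be standardized")
    scaled = [(v - mu) / sigma for v in logged]
    # Return mu and sigma so forecasts can be mapped back to case counts.
    return scaled, mu, sigma


def invert_log_zscore(scaled, mu, sigma):
    """Undo the z-score and log steps to recover values on the count scale."""
    return [math.expm1(v * sigma + mu) for v in scaled]
```

For example, `log_zscore([12, 30, 7, 45, 0, 18])` returns a series with zero mean and unit variance together with the two parameters needed by `invert_log_zscore` to express predictions as case counts again.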

Results: The proposed ensemble model was compared against standard support vector regression (SVR) using three error metrics: root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). On all three disease datasets, the proposed ensemble model outperformed the standard SVR model.
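For readers unfamiliar with the three error metrics, the sketch below shows their standard definitions. The function names and the sample values in the usage note are illustrative assumptions, not data or code from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    MAPE is undefined when any actual value is zero, so that case is
    rejected explicitly.
    """
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n
```

With invented values `actual = [100, 200, 400]` and `predicted = [110, 190, 380]`, the residuals are -10, 10, and 20, which gives an RMSE of about 14.14, an MAE of about 13.33, and a MAPE of about 6.67 percent. RMSE weights the larger residual more heavily than MAE does, which is why the two are usually reported together.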

Conclusion: The main objective of this work was to minimize prediction error; the proposed sequential ensemble model showed a clear improvement in prediction error over the standard SVR model.

Keywords: Influenza, hand foot and mouth disease, diarrhea, neural network autoregression, support vector regression, ensemble.
