Recent Advances in Computer Science and Communications

Author(s): Alisha Banga*, Ravinder Ahuja and Subhash C. Sharma

DOI: 10.2174/2666255813999200628094351

Stacking Regression Algorithms to Predict PM2.5 in the Smart City Using Internet of Things

Article ID: e111022183236 Pages: 17


Abstract

Background: As urban populations grow, pollution increases as well. Air pollution is one of the challenging environmental issues facing smart cities.

Objective: Real-time monitoring of air quality can help the administration make appropriate decisions in time. Advances in Internet of Things (IoT) based sensors have changed how air quality is monitored.

Methods: In this paper, we apply a two-stage regression approach. In the first stage, ten regression algorithms (Decision Tree, Random Forest, Elastic Net, AdaBoost, Extra Tree, Linear Regression, Lasso, XGBoost, LightGBM, and Multi-Layer Perceptron) are applied. In the second stage, the four best algorithms are selected and stacking ensemble algorithms are applied, using Python, to predict PM2.5 pollutant levels in the air. Datasets from five Chinese cities (Beijing, Chengdu, Guangzhou, Shanghai, and Shenyang) are considered, and the models are compared on MAE (Mean Absolute Error), RMSE (Root Mean Square Error), and R².
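
The stacking stage described in the Methods can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' setup: the abstract does not say which four algorithms were selected or which meta-learner was used, so the four base learners and the Ridge final estimator are placeholders, and the data is synthetic (generated with scikit-learn's make_regression), not the five-city dataset.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    ExtraTreesRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a table of sensor features (NOT the paper's data).
X, y = make_regression(n_samples=400, n_features=6, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Assumed "best four" first-stage models; the abstract does not name them.
base_learners = [
    ("extra_trees", ExtraTreesRegressor(n_estimators=50, random_state=0)),
    ("random_forest", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("decision_tree", DecisionTreeRegressor(random_state=0)),
    ("linear", LinearRegression()),
]

# Second stage: a meta-learner (assumed Ridge) is fitted on the
# cross-validated predictions of the base learners.
stack = StackingRegressor(estimators=base_learners, final_estimator=Ridge(), cv=5)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)


def evaluate(y_true, y_hat):
    """Return the three comparison metrics named in the abstract."""
    mae = mean_absolute_error(y_true, y_hat)
    rmse = float(np.sqrt(mean_squared_error(y_true, y_hat)))
    r2 = r2_score(y_true, y_hat)
    return {"MAE": mae, "RMSE": rmse, "R2": r2}


scores = evaluate(y_test, y_pred)
print(scores)
```

The same evaluate helper could be applied to each first-stage model on its own, which is one way to rank candidates before choosing which ones to stack.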

Results: Of the ten regression algorithms applied, the Extra Tree algorithm performed best on all five datasets, and stacking improved performance further.

Conclusion: Feature importance for Shenyang and Beijing was computed using three regression algorithms; the four most important features were humidity, wind speed, wind direction, and dew point.
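
As a rough illustration of how a feature-importance ranking can be obtained from a tree ensemble, the sketch below uses scikit-learn's ExtraTreesRegressor and its impurity-based feature_importances_ attribute. The abstract does not name the three algorithms used, so this choice is an assumption. The data is randomly generated, the temperature and pressure columns are hypothetical additions, and the target is constructed by hand, so the resulting ranking reflects only this toy setup and says nothing about the paper's findings.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)

# Column names: the first four appear in the abstract; the last two are
# hypothetical extras added only to make the toy table wider.
feature_names = [
    "humidity",
    "wind_speed",
    "wind_direction",
    "dew_point",
    "temperature",
    "pressure",
]

# Random synthetic features (NOT real measurements).
X = rng.normal(size=(500, len(feature_names)))

# Hand-built toy target that depends only on the first two columns,
# so the expected ranking of this example is known in advance.
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

model = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances are non-negative and sum to 1.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name:15s} {importance:.3f}")
```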

Keywords: AQI, regression, deep learning, imputation techniques, PM2.5, machine learning, ensemble learning, IoT, smart city.