Abstract
This study conducts a comprehensive review of Deep Learning-based approaches
for accurate object segmentation and detection in high-resolution imagery captured by Unmanned
Aerial Vehicles (UAVs). The methodology applies three existing algorithms, each
matched to specific target classes: Res-UNet for roads and buildings, DeepForest for trees,
and WaterDetect for water bodies. To
evaluate the effectiveness of this approach, the performance of each algorithm is compared
with state-of-the-art (SOTA) models for each class. The results show that
the methodology outperforms the SOTA models on all three tasks, achieving an accuracy of 93%
for roads and buildings using Res-UNet, 95% for trees using DeepForest, and
98% for water bodies using WaterDetect. Using three smaller, task-specific models
also reduces overfitting and speeds up training.
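The abstract does not say how the reported accuracy is computed. As an illustration only, assuming the figures for the segmentation outputs refer to pixel-level agreement between a predicted mask and a reference mask, the metric could be sketched as below. The function name `pixel_accuracy` and the toy masks are hypothetical and are not taken from the study; detection outputs such as tree bounding boxes would need a different, box-based measure.

```python
from typing import Sequence


def pixel_accuracy(pred: Sequence[Sequence[int]],
                   truth: Sequence[Sequence[int]]) -> float:
    """Fraction of pixels whose predicted label equals the reference label."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of rows")
    correct = 0
    total = 0
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("mask rows must have the same length")
        # Count matching labels in this row.
        correct += sum(p == t for p, t in zip(pred_row, truth_row))
        total += len(truth_row)
    if total == 0:
        raise ValueError("masks must not be empty")
    return correct / total


# Hypothetical 2x4 binary masks (1 = water, 0 = background), illustration only.
predicted = [[1, 1, 0, 0],
             [0, 1, 0, 0]]
reference = [[1, 1, 0, 0],
             [0, 0, 0, 0]]

print(pixel_accuracy(predicted, reference))  # 7 of 8 pixels agree -> 0.875
```

Pixel accuracy can look high when one class dominates the image, so per-class measures such as intersection over union are often reported alongside it.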