Lightweight Underwater Target Detection Method based on Improved YOLOv5s

Yongfa Mi; Mingshan Chi; Qiang Zhang; Pengjie Liu; Fangyang Sun

Abstract

Introduction: Many patents and papers on target detection for underwater robots have aimed to improve detection accuracy, but they have overlooked lightweight detection methods suited to the limited resources of underwater robots.

Method: In this study, we proposed a lightweight underwater target detection method that maintains high accuracy on devices with limited resources. The proposed algorithm enhanced YOLOv5s with the Ghost lightweight network, the EMA attention mechanism, and CARAFE up-sampling. To validate the method, we conducted comparative experiments, visual analysis, and ablation experiments.
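As a rough illustration of why a Ghost-style layer reduces model size, the sketch below counts the weights in a standard convolution and in a Ghost-style convolution. It is not the authors' implementation: the function names, the ratio of 2, the 5×5 depthwise "cheap" kernel, and the channel counts are all assumptions chosen for illustration.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, ratio=2, cheap_k=5):
    """Weights in a Ghost-style convolution (bias omitted).

    A primary k x k convolution produces c_out // ratio intrinsic channels;
    depthwise cheap_k x cheap_k filters then generate the remaining channels.
    The ratio and cheap_k defaults are assumptions, not values from the paper.
    """
    intrinsic = c_out // ratio
    ghost = c_out - intrinsic
    primary = c_in * intrinsic * k * k
    cheap = ghost * cheap_k * cheap_k  # one depthwise filter per ghost channel
    return primary + cheap


# Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
standard = conv_params(128, 256, 3)
ghost = ghost_conv_params(128, 256, 3)
print(f"standard: {standard}, ghost: {ghost}, reduction: {standard / ghost:.2f}x")
```

With a ratio of 2, the primary convolution carries about half the weights of the standard layer, and the depthwise filters add only a small extra cost, so the layer shrinks by a factor of nearly 2.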

Results: The experimental results showed that our algorithm had a model size of only 9.7 M, with 4.38×10⁶ parameters and a computational cost of 8.4 GFLOPs. Precision, recall, and mAP@0.5 increased by 4.2%, 2.2%, and 2.5%, respectively.
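For readers unfamiliar with the reported metrics, the sketch below shows how precision, recall, and the IoU test behind the "@0.5" threshold are conventionally computed. The function names, box coordinates, and detection counts are made-up illustration values, not data from this study. mAP@0.5 is then the mean, over classes, of the average precision obtained when a detection counts as correct only if its IoU with a ground-truth box is at least 0.5.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical boxes: a prediction counts as a true positive at IoU >= 0.5.
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
is_match = overlap >= 0.5

# Hypothetical counts for one class.
p, r = precision_recall(tp=80, fp=20, fn=10)
print(f"IoU={overlap:.3f}, match={is_match}, precision={p:.3f}, recall={r:.3f}")
```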

Conclusion: The improved algorithm provides an efficient and accurate solution for target detection on underwater robots.

Keywords: Underwater target detection, lightweight, YOLOv5s, Ghost, EMA, CARAFE.


Cite As

Recent Patents on Engineering
