Recent Advances in Computer Science and Communications

Author(s): Hongxin Zhang*, Hui Jin and Shaowei Ma

DOI: 10.2174/2666255816666230509153317

Recent Advances in Robot Visual SLAM

Article ID: e090523216719 Pages: 19

Abstract

Background: SLAM plays an important role in the navigation of robots, unmanned aerial vehicles, and unmanned ground vehicles. Positioning accuracy affects the accuracy of obstacle avoidance, and the quality of map construction directly affects the performance of subsequent path planning and other algorithms. SLAM is a core algorithm of intelligent mobile applications. Therefore, robot visual SLAM has great research value and will be an important research direction in the future.

Objective: By reviewing the latest developments and patents in computer vision SLAM, this paper provides a reference for researchers in related fields.

Methods: Computer vision SLAM patents and literature were analyzed with respect to algorithms, innovations, and applications, covering more than 30 patents and nearly 30 publications from the past ten years.

Results: This paper reviews the research progress of robot visual SLAM over the last 10 years and summarizes its typical features. In particular, it describes the front end of the visual SLAM system in detail and notes the main advantages and disadvantages of each method. It then analyzes the main problems in the development of robot visual SLAM and its likely development trends, and finally discusses related products and patents as well as the current research status and future of robot visual SLAM technology.
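To make the front-end discussion concrete, the following minimal sketch shows what a feature-based visual SLAM front end typically computes: ORB feature matching between two frames followed by two-view relative pose recovery. It is written in Python with OpenCV and is illustrative only; the file names and camera intrinsics below are placeholder assumptions, not values taken from any work reviewed here.

    # Minimal feature-based front-end sketch (hypothetical inputs).
    import cv2
    import numpy as np

    # Assumed pinhole camera intrinsics (placeholder values).
    K = np.array([[718.856, 0.0, 607.193],
                  [0.0, 718.856, 185.216],
                  [0.0, 0.0, 1.0]])

    # Two overlapping grayscale frames; the file names are hypothetical.
    img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

    # Detect ORB keypoints (FAST corners + BRIEF binary descriptors).
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Brute-force Hamming matching with cross-checking for reliability.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # RANSAC rejects outlier matches while fitting the essential matrix;
    # the relative rotation R and translation direction t are then
    # recovered (monocular scale is unobservable).
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    print("R =", R, "\nt =", t.ravel())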

Conclusion: Robot visual SLAM can compare the texture information of an environment and identify the differences between two environments, thus improving accuracy. However, current SLAM algorithms are prone to failure under fast motion and in highly dynamic environments, most SLAM action plans are inefficient, and the image features used by VSLAM are not distinctive enough. Furthermore, more patents on robot visual SLAM should be filed.
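As a rough illustration of the texture-comparison idea above, the following Python/OpenCV sketch scores how similar two environment images are by the fraction of distinctively matched ORB keypoints, in the spirit of appearance-based place recognition. The function name, image paths, and the 0.75 ratio threshold are illustrative assumptions, not part of the reviewed methods.

    # Hypothetical texture-similarity check between two environment images.
    import cv2

    def texture_similarity(path_a, path_b, ratio=0.75):
        """Return a score in [0, 1]: fraction of keypoints with a distinctive match."""
        img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
        img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
        orb = cv2.ORB_create(1000)
        kp_a, des_a = orb.detectAndCompute(img_a, None)
        kp_b, des_b = orb.detectAndCompute(img_b, None)
        if des_a is None or des_b is None:
            return 0.0  # no texture to compare
        # For each descriptor, take the two nearest neighbours and apply
        # Lowe's ratio test: keep a match only if it is clearly better than
        # the runner-up, i.e. the feature is distinctive.
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
        pairs = matcher.knnMatch(des_a, des_b, k=2)
        good = [p for p in pairs if len(p) == 2 and p[0].distance < ratio * p[1].distance]
        return len(good) / max(len(kp_a), 1)

    # A high score suggests the two views show the same place (a loop-closure
    # hint); the image paths here are placeholders.
    print(texture_similarity("view_a.png", "view_b.png"))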

Graphical Abstract

References

[1]
L.X. Lin, Y. Ye, J.M. Yao, and T.L. Guo, "Embedded implementation and optimization of mobile robot based on ORB-SLAM", Microcomp. Appl., vol. 36, no. 5, pp. 50-53, 2017.
[2]
X. Gao, T. Zhang, and Y. Liu, Fourteen Lectures on Visual SLAM, Electronic Industry Press: Beijing, 2017, pp. 13-19.
[3]
T. Taketomi, H. Uchiyama, and S. Ikeda, "Visual SLAM algorithms: A survey from 2010 to 2016", IPSJ Trans. Comp. Vision Appl., vol. 9, no. 1, p. 16, 2017.
[http://dx.doi.org/10.1186/s41074-017-0027-2]
[4]
X. Gao, "From theory to practice", In: X. Gao, Ed., Fourteen Lectures on Visual SLAM., Electronic Industry Press: Beijing, 2017.
[5]
X. Zou, C.S. Xiao, Y.Q. Wen, and H.W. Yuan, "Research status of vSLAM based on feature point method and direct method", In: Computer Application Research, 2020, pp. 1-13.
[6]
N. Sünderhauf, and P. Protzel, Towards a robust back-end for pose graph SLAM. IEEE International Conference on Robotics and Automation, 2012, pp. 1254-1261.
[http://dx.doi.org/10.1109/ICRA.2012.6224709]
[7]
T. Bailey, J. Nieto, J. Guivant, M. Stevens, and E. Nebot, Consistency of the EKF-SLAM Algorithm. 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 3562-3568.
[http://dx.doi.org/10.1109/IROS.2006.281644]
[8]
B.P. Wrobel, "Multiple view geometry in computer vision", Kybernetes, vol. 30, no. 9/10, pp. 1333-1341, 2001.
[http://dx.doi.org/10.1108/k.2001.30.9_10.1333.2]
[9]
P. Newman, and K. Ho, SLAM-loop closing with visually salient features. IEEE International Conference on Robotics & Automation, 2005, pp. 635-642.
[http://dx.doi.org/10.1109/ROBOT.2005.1570189]
[10]
D. Gálvez-López, and J.D. Tardós, "Bags of binary words for fast place recognition in image sequences", IEEE Trans. Robot., vol. 28, no. 5, pp. 1188-1197, 2012.
[http://dx.doi.org/10.1109/TRO.2012.2197158]
[11]
X.F. Li, N. Zhou, and L. Zhou, Communication Principle, Tsinghua University Press: Beijing, 2011.
[12]
Y.F. Li, and B.H. Zhu, "Application of UQPSK modulation in broadband data transmission and tracking system", In: Electronic Technology and Software Engineering, 2015.
[13]
H. Cho, E.K. Kim, and S. Kim, "Indoor SLAM application using geometric and ICP matching methods based on line features", Robot. Auton. Syst., vol. 100, pp. 206-224, 2018.
[http://dx.doi.org/10.1016/j.robot.2017.11.011]
[14]
F. Gönültaş, M.E. Atik, and Z. Duran, "Extraction of roof planes from different point clouds using RANSAC algorithm", Int. J. Environ. Geoinformatics, vol. 7, no. 2, pp. 165-171, 2020.
[http://dx.doi.org/10.30897/ijegeo.715510]
[15]
A.J. Davison, SLAM with a single camera. Proceedings of Workshop on Concurrent Mapping and Localization for Autonomous Mobile Robots in Conjunction with ICRA, Washington, DC, USA, 2002, pp. 18-27.
[16]
A.J. Davison, Real-time simultaneous localisation and mapping with a single camera. Proceedings of the Ninth IEEE International Conference on Computer Vision, Washington, DC, USA, 2003, pp. 1403-1410.
[http://dx.doi.org/10.1109/ICCV.2003.1238654]
[17]
A.J. Davison, I.D. Reid, N.D. Molton, and O. Stasse, "MonoSLAM: Real-time single camera SLAM", IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 6, pp. 1052-1067, 2007.
[http://dx.doi.org/10.1109/TPAMI.2007.1049] [PMID: 17431302]
[18]
J. Civera, A.J. Davison, and J. Montiel, "Inverse depth parametrization for monocular SLAM", IEEE Trans. Robot., vol. 24, no. 5, pp. 932-945, 2008.
[http://dx.doi.org/10.1109/TRO.2008.2003276]
[19]
R. Martinez-Cantin, and J.A. Castellanos, "Unscented SLAM for large-scale outdoor environments", 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, Edmonton, AB, Canada, 2005, pp. 3427-3432.
[http://dx.doi.org/10.1109/IROS.2005.1545002]
[20]
D. Chekhlov, M. Pupilli, W. Mayol-Cuevas, and A. Calway, Real-time and robust monocular SLAM using predictive multi-resolution descriptors. Proceedings of the Second International Conference on Advances in Visual Computing, 2006, pp. 276-285.
[http://dx.doi.org/10.1007/11919629_29]
[21]
S. Holmes, G. Klein, and D. Murray, A square root unscented Kalman filter for visual monoSLAM. Proceedings of 2008 International Conference on Robotics and Automation, ICRA, Pasadena, California, USA, 2008, pp. 3710-3716.
[http://dx.doi.org/10.1109/ROBOT.2008.4543780]
[22]
R. Sim, P. Elinas, M. Griffin, and J.J. Little, "Vision-based SLAM using the Rao-Blackwellised particle filter", In: IJCAI Workshop on Reasoning with Uncertainty in Robotics, 2005, pp. 500-509.
[23]
M. Li, B. Hong, Z. Cai, and R. Luo, "Novel Rao-Blackwellized particle filter for mobile robot SLAM using monocular vision", Int. J. Autom. Control, vol. 2, no. 3, 2006.
[24]
G. Klein, and D. Murray, "Parallel tracking and mapping for small AR workspaces", 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, Nara, Japan, 2007.
[http://dx.doi.org/10.1109/ISMAR.2007.4538852]
[25]
R. Mur-Artal, J.M.M. Montiel, and J.D. Tardós, "ORB-SLAM: A versatile and accurate monocular SLAM system", IEEE Trans. Robot., vol. 31, no. 5, pp. 1147-1163, 2015.
[http://dx.doi.org/10.1109/TRO.2015.2463671]
[26]
Y.J. Zhang, B. Li, and D.C. Huang, "A robot positioning and map building method and a robot", CN. Patent 109583457A, 2018.
[27]
Y. Liu, F. Wang, Y.W. Xia, C.F. Zhang, and W. Zhang, "Panoramic inertial navigation SLAM method based on multiple key frames", CN. Patent 109307508A, 2018.
[28]
R. Wang, W.Z. Cha, J.J. Ge, F.L. Meng, and X.R. Meng, "A visual SLAM method based on semantic constraint", CN. Patent 109815847A, 2018.
[29]
B. You, and Q. Liang, "A vision SLAM method based on multi feature fusion", CN. Patent 110060277A, 2019.
[30]
J.Q. Feng, R.J. Xu, X. Zhao, and C. Zhu, "Visual SLAM key frame and feature point selection method based on feature point distribution", CN. Patent 110070577A, 2019.
[31]
S.P. Ding, N.C. He, Z.L. He, X. Yao, and Q.Y. Zhang, "Visual SLAM method based on instance segmentation", CN. Patent 110738673A, 2019.
[32]
L.Y. Cui, Z.H. Guo, and C.W. Ma, "Visual SLAM method based on semantic optical flow and inverse depth filtering", CN. Patent 111311708A, 2020.
[33]
B.T. Zhang, C.Y. Lee, H. Lee, and I. Hwang, "Method and apparatus for enhancing image feature point in visual SLAM by using object label", KR. Patent WO2020111844A2, 2019.
[34]
Y.Q. Liu, X.L. Zhang, J.M. Li, Y.Z. Gu, and D.D. Yang, "Tightly coupled binocular vision-inertial SLAM method using combined point-line features", CN. Patent 109579840A, 2018.
[35]
K.X. Xing, W. Wan, Y.G. Lin, C. Guo, and C.T. Feng, "Robot positioning and map construction system based on binocular vision features and IMU information", CN. Patent 108665540A, 2018.
[36]
E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, ORB: An efficient alternative to SIFT or SURF. Proceedings of 2011 International Conference on Computer Vision, Barcelona, Spain, 2011, pp. 2564-2571.
[http://dx.doi.org/10.1109/ICCV.2011.6126544]
[37]
R. Mur-Artal, J.M.M. Montiel, and J.D. Tardós, "ORB-SLAM: A versatile and accurate monocular SLAM system", IEEE Trans. Robot., vol. 31, no. 5, pp. 1147-1163, 2015.
[http://dx.doi.org/10.1109/TRO.2015.2463671]
[38]
R. Mur-Artal, and J.D. Tardós, "ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras", IEEE Trans. Robot., vol. 33, no. 5, pp. 1255-1262, 2017.
[http://dx.doi.org/10.1109/TRO.2017.2705103]
[39]
C. Forster, Z. Zhang, M. Gassner, M. Werlberger, and D. Scaramuzza, "SVO: semidirect visual odometry for monocular and multicamera systems", IEEE Trans. Robot., vol. 33, no. 2, pp. 249-265, 2017.
[http://dx.doi.org/10.1109/TRO.2016.2623335]
[40]
S.Y. Loo, A.J. Amiri, S. Mashohor, S.H. Tang, and H. Zhang, "CNN-SVO: Improving the mapping in semi-direct visual odometry using single-image depth prediction", arXiv:1810.01011, 2020.
[41]
G. Zhang, H. Liu, Z. Dong, J. Jia, T-T. Wong, and H. Bao, "Efficient non-consecutive feature tracking for robust structure-from-motion", IEEE Trans. Image Process., vol. 25, no. 12, pp. 5957-5970, 2016.
[http://dx.doi.org/10.1109/TIP.2016.2607425] [PMID: 27623586]
[42]
J. Engel, V. Koltun, and D. Cremers, "Direct sparse odometry", IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 3, pp. 611-625, 2018.
[http://dx.doi.org/10.1109/TPAMI.2017.2658577] [PMID: 28422651]
[43]
D. Schlegel, M. Colosi, and G. Grisetti, "ProSLAM: graph SLAM from a programmer’s perspective", arXiv:1709.04377, 2017.
[44]
S. Sumikura, M. Shibuya, and K. Sakurada, OpenVSLAM: A versatile visual SLAM framework. Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 2019.
[http://dx.doi.org/10.1145/3343031.3350539]
[45]
J. Engel, T. Schöps, and D. Cremers, LSD-SLAM: Large-scale direct monocular SLAM. Proceedings of the 13th European Conference on Computer Vision, Zurich, Switzerland, 2014.
[46]
J. Engel, J. Stückler, and D. Cremers, Large-scale direct SLAM with stereo cameras. Proceedings of 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, Germany, 2015, pp. 1935-1942.
[47]
S.L. He, "Athlete positioning system based on visual SLAM algorithm", CN. Patent 112950716A, 2021.
[48]
L. Ma, H. Jiang, X.Z. Tan, and B. Wang, "Visual positioning method of sparse three-dimensional point cloud chart based on VSLAM", CN. Patent 110889349A, 2019.
[49]
Y.L. Hu, Z.H. Yan, L.L. Sun, Z.H. Li, and X.Y. Liu, "Implementation of 3D sparse point cloud to 2D grid map based on VSLAM", CN. Patent 110675307A, 2019.
[50]
X. Wang, Z.H. Xiao, and D.G. Guan, "Visual positioning method based on ORB sparse point cloud and two-dimensional code", CN. Patent 107830854A, 2017.
[51]
J. Engel, T. Schöps, and D. Cremers, LSD-SLAM: Large-scale direct monocular SLAM. European Conference on Computer Vision, 2014, pp. 834-849.
[http://dx.doi.org/10.1007/978-3-319-10605-2_54]
[52]
D. Weikersdorfer, R. Hoffmann, and J. Conradt, Simultaneous localization and mapping for event-based vision systems. Proceedings of the 9th International Conference on Computer Vision Systems, St. Petersburg, Russia, 2013, pp. 133-142.
[http://dx.doi.org/10.1007/978-3-642-39402-7_14]
[53]
D. Caruso, J. Engel, and D. Cremers, "Large-scale direct SLAM for omnidirectional cameras", 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 2015, pp. 141-148.
[http://dx.doi.org/10.1109/IROS.2015.7353366]
[54]
D. Weikersdorfer, R. Hoffmann, and J. Conradt, "Simultaneous localization and mapping for event-based vision systems", Proceedings of the 9th International Conference on Computer Vision Systems, St. Petersburg, Russia, 2013, pp. 133-142.
[http://dx.doi.org/10.1007/978-3-642-39402-7_14]
[55]
D. Weikersdorfer, D.B. Adrian, D. Cremers, and J. Conradt, "Event-based 3D SLAM with a depth-augmented dynamic vision sensor", 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 2014, pp. 359-364.
[http://dx.doi.org/10.1109/ICRA.2014.6906882]
[56]
R.A. Newcombe, S.J. Lovegrove, and A.J. Davison, "DTAM: Dense tracking and mapping in real-time", 2011 International Conference on Computer Vision, Barcelona, Spain, 2011, pp. 2320-2327.
[http://dx.doi.org/10.1109/ICCV.2011.6126513]
[57]
Y. Zhou, G. Gallego, H. Rebecq, L. Kneip, H.D. Li, and D. Scaramuzza, Semi-dense 3D reconstruction with a stereo event camera. Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018, pp. 242-258.
[http://dx.doi.org/10.1007/978-3-030-01246-5_15]
[58]
M. Dong, M.F. Pei, and S. Bi, "Method for creating semi-dense cognitive map for binocular SLAM (simultaneous localization and mapping)", CN. Patent 108151728A, 2017.
[59]
J.J. Ni, Y. Yang, J.X. Zhu, and P.F. Shi, "Mobile robot semi-dense map construction method based on monocular vision", CN. Patent 111860651A, 2020.
[60]
Y.H. Pan, "SLAM-based narrow-lane passing obstacle detection method", CN. Patent 110378919A.
[61]
X.Z. Chen, L.X. Wang, Q.Q. Mao, and M. Zhou, "Monocular camera imaging semi-dense mapping method and device, and storage medium", CN. Patent 113902859A, 2021.
[62]
T. Whelan, M. Kaess, H. Johannsson, M. Fallon, J.J. Leonard, and J. McDonald, "Real-time large-scale dense RGB-D SLAM with volumetric fusion", Int. J. Robot. Res., vol. 34, no. 4-5, pp. 598-626, 2015.
[http://dx.doi.org/10.1177/0278364914551008]
[63]
W.N. Greene, K. Ok, P. Lommel, and N. Roy, "Multi-level mapping: Real-time dense monocular SLAM", 2016 IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 2016, pp. 833-840.
[http://dx.doi.org/10.1109/ICRA.2016.7487213]
[64]
C. Kerl, J. Sturm, and D. Cremers, "Robust odometry estimation for RGB-D cameras", 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, 2013, pp. 3748-3754.
[http://dx.doi.org/10.1109/ICRA.2013.6631104]
[65]
R.A. Newcombe, D. Fox, and S.M. Seitz, "DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time", 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 2015, pp. 343-352.
[http://dx.doi.org/10.1109/CVPR.2015.7298631]
[66]
M. Innmann, M. Zollhöfer, M. Nießner, C. Theobalt, and M. Stamminger, VolumeDeform: Real-time volumetric non-rigid reconstruction. Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016, pp. 362-379.
[67]
M. Dou, S. Khamis, Y. Degtyarev, P. Davidson, S.R. Fanello, A. Kowdle, S.O. Escolano, C. Rhemann, D. Kim, J. Taylor, P. Kohli, V. Tankovich, and S. Izadi, "Fusion4D: Real-time performance capture of challenging scenes", ACM Trans. Graph., vol. 35, no. 4, pp. 1-13, 2016.
[http://dx.doi.org/10.1145/2897824.2925969]
[68]
T. Whelan, and S. Leutenegger, "ElasticFusion: Dense SLAM without a pose graph", In: Proceedings of Robotics: Science and Systems, Rome, Italy, 2015.
[69]
T. Whelan, R.F. Salas-Moreno, B. Glocker, A.J. Davison, and S. Leutenegger, "ElasticFusion: Real-time dense SLAM and light source estimation", Int. J. Robot. Res., vol. 35, no. 14, pp. 1697-1716, 2016.
[http://dx.doi.org/10.1177/0278364916669237]
[70]
O. Kähler, V.A. Prisacariu, and D.W. Murray, "Real-time large-scale dense 3D reconstruction with loop closure", Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016, pp. 500-516.
[71]
V.A. Prisacariu, O. Kähler, S. Golodetz, M. Sapienza, and D.W. Murray, "InfiniTAM v3: A framework for large-scale 3D reconstruction with loop closure", arXiv:1708.00783, 2017.
[72]
O. Kähler, V.A. Prisacariu, C.Y. Ren, X. Sun, P. Torr, and D. Murray, "Very high frame rate volumetric integration of depth images on mobile devices", IEEE Trans. Vis. Comput. Graph., vol. 21, no. 11, pp. 1241-1250, 2015.
[http://dx.doi.org/10.1109/TVCG.2015.2459891] [PMID: 26439825]
[73]
F. Endres, J. Hess, J. Sturm, D. Cremers, and W. Burgard, "3-D mapping with an RGB-D camera", IEEE Trans. Robot., vol. 30, no. 1, pp. 177-187, 2014.
[http://dx.doi.org/10.1109/TRO.2013.2279412]
[74]
Y. Gao, H.C. Luo, Y.H. Wu, and X. Yang, "Real-time dense monocular SLAM method and system based on online learning depth prediction network", CN. Patent 107945265A, 2017.
[75]
Y. Tang, F. Qian, W.L. Du, and W. Du, "Semantic mapping method based on visual SLAM and two-dimensional semantic segmentation", CN. Patent 111462135A, 2020.
[76]
J. Levinson, and S. Thrun, "Automatic online calibration of cameras and lasers", In: Proceedings of Robotics: Science and Systems, 2013.
[http://dx.doi.org/10.15607/RSS.2013.IX.029]
[77]
J. Zhang, and S. Singh, "Visual-lidar odometry and mapping: low-drift, robust, and fast", 2015 IEEE International Conference on Robotics and Automation, Seattle, WA, USA, 2015, pp. 2174-2181.
[http://dx.doi.org/10.1109/ICRA.2015.7139486]
[78]
J. Graeter, A. Wilczynski, and M. Lauer, "LIMO: Lidar-Monocular Visual Odometry", 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, Madrid, Spain, 2018, pp. 7872-7879.
[http://dx.doi.org/10.1109/IROS.2018.8594394]
[79]
W.Z. Shao, S. Vijayarangan, C. Li, and G. Kantor, "Stereo visual inertial lidar simultaneous localization and mapping", arXiv:1902.10741, 2019.
[http://dx.doi.org/10.1109/IROS40897.2019.8968012]
[80]
H. Zhou, D. Zou, L. Pei, R. Ying, P. Liu, and W. Yu, "StructSLAM: Visual SLAM with building structure lines", IEEE Trans. Vehicular Technol., vol. 64, no. 4, pp. 1364-1375, 2015.
[http://dx.doi.org/10.1109/TVT.2015.2388780]