Bulletin of Surveying and Mapping ›› 2025, Vol. 0 ›› Issue (4): 9-13.doi: 10.13474/j.cnki.11-2246.2025.0402


A visual SLAM method for merging semantic information in indoor dynamic scenes

WANG Yizhe1, ZHANG Ruiju1,2,3, WANG Jian1, XIE Xinrui1, HUANG Qicheng1   

  1. School of Geomatics and Urban Spatial Informatics, Beijing University of Civil Engineering and Architecture, Beijing 102616, China;
    2. Engineering Center of Representative Architecture and Ancient Architecture Database, Ministry of Education, Beijing 102616, China;
    3. Key Laboratory of Fine Reconstruction and Health Monitoring of Architectural Heritage, Beijing 102616, China
  • Received: 2024-06-20    Published: 2025-04-28

Abstract: Visual SLAM is a core technology for autonomous perception and navigation in intelligent devices, playing a crucial role in AI and robotics. However, traditional visual SLAM algorithms suffer significant degradation in stability and localization accuracy when the scene contains moving objects. To address this, the paper proposes a SLAM scheme for indoor dynamic scenes that integrates semantic information. Building on ORB-SLAM2, it introduces the GCNv2 network for deep feature extraction and YOLOv5 for semantic segmentation to identify dynamic objects; combined with a motion consistency check, it effectively eliminates dynamic interference and enhances robustness. Tests on the TUM standard dataset show that the improved algorithm significantly outperforms the original ORB-SLAM2 in dynamic indoor environments, with an average localization accuracy improvement of 55.75%. This result demonstrates the effectiveness of the proposed method and a substantial performance gain for SLAM systems in complex dynamic environments.
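The dynamic-object handling the abstract describes, dropping features that fall on semantically dynamic objects and then vetting the remainder with a motion (epipolar) consistency check, can be sketched as follows. This is a minimal illustration under assumptions of our own: the class set `DYNAMIC_CLASSES`, the label-map convention for the segmentation mask, and the helper names are hypothetical, not the paper's actual interfaces.

```python
import numpy as np

# Assumed set of semantic classes treated as dynamic (hypothetical choice).
DYNAMIC_CLASSES = {"person", "cat", "dog"}


def filter_dynamic_keypoints(keypoints, seg_mask, class_names):
    """Drop keypoints lying inside segmentation regions of dynamic classes.

    keypoints   : (N, 2) array of (x, y) pixel coordinates
    seg_mask    : (H, W) integer array of per-pixel class ids (e.g. from YOLOv5-seg)
    class_names : dict mapping class id -> class name
    """
    kept = []
    for x, y in keypoints:
        cls_id = seg_mask[int(y), int(x)]          # mask is indexed (row, col)
        if class_names.get(cls_id) not in DYNAMIC_CLASSES:
            kept.append((x, y))
    return np.array(kept)


def epipolar_residual(p1, p2, F):
    """Point-to-epipolar-line distance used in a motion consistency check.

    A matched static pair should satisfy p2^T F p1 ~ 0; a large residual
    suggests the point moved independently of the camera.
    """
    line = F @ np.append(p1, 1.0)                  # epipolar line of p1 in image 2
    return abs(line @ np.append(p2, 1.0)) / np.hypot(line[0], line[1])
```

In a full pipeline, features surviving the semantic filter whose epipolar residual exceeds a threshold would also be discarded before tracking and mapping, which is the two-stage rejection the abstract outlines.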

Key words: visual SLAM, semantic information, feature extraction, dynamic scenes
