Real-time semantic SLAM algorithm based on dynamic scenes

doi: 10.13474/j.cnki.11-2246.2025.0405

Bulletin of Surveying and Mapping, 2025, Vol. 0, Issue (4): 27-33.

• Artificial Intelligence and Vision Systems •

FU Qiang^1,2, ZHONG Zhen^1,2, JI Yuanfa^1,2,3,4, REN Fenghua^1,2

1. Guangxi Key Laboratory of Precision Navigation Technology and Application, Guilin University of Electronic Technology, Guilin 541004, China;
2. School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China;
3. Joint Laboratory for International Cooperation on Spatio-Temporal Information and Intelligent Location Services, Guilin 541004, China;
4. Guangxi Institute of Industry and Research Spatio-Temporal Information Technology Co., Ltd., Nanning 530023, China

Received: 2024-08-01 Published: 2025-04-28
Corresponding author: REN Fenghua. E-mail: 643749271@qq.com
About the author: FU Qiang (born 1976), male, master's degree, professor-level senior experimentalist; main research interests: image processing, satellite navigation and positioning. E-mail: 2310396462@qq.com
Funding:
National Natural Science Foundation of China (U23A20280; 62161007; 62061010); Science and Technology Department of Guangxi projects (桂科AD22080061; 桂科AA23062038; 桂科AB23026120; 桂科ZY22096026)

Abstract

Traditional visual SLAM (simultaneous localization and mapping) suffers from low localization accuracy and poor robustness in indoor dynamic environments, loses real-time performance once it is combined with deep learning, and cannot build dense maps. To address these problems, this paper proposes an improved algorithm based on ORB-SLAM3. First, a lightweight SegFormer semantic segmentation network identifies the dynamic objects in the image, and an adaptive mask dilation method then adjusts the dilation range of the mask automatically according to the number of feature points, so that static feature points are retained and potential dynamic feature points are removed more effectively. Second, the bag-of-words model is improved to speed up loading and matching. Finally, a dense mapping thread is added that uses the mask information and keyframes to build a dense point cloud map with dynamic features removed. Experimental results show that the algorithm effectively removes feature points on dynamic objects in highly dynamic scenes and improves the localization accuracy and robustness of the system. The average processing speed is 20.3 FPS, which basically meets the requirements of real-time operation.

Keywords: visual SLAM (VSLAM), ORB-SLAM3, semantic segmentation, dense mapping
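
The abstract does not state the exact rule used for adaptive mask dilation, so the following is only a minimal sketch of the general idea under stated assumptions. It grows the segmentation mask of a dynamic object with a square structuring element and picks the largest dilation radius that still leaves a minimum number of feature points outside the mask. The function names `dilate` and `filter_keypoints`, the parameters `max_radius` and `min_static`, and the selection rule itself are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Mask = List[List[int]]   # 1 = pixel labelled as dynamic object, 0 = background
Point = Tuple[int, int]  # (row, col) location of a feature point


def dilate(mask: Mask, radius: int) -> Mask:
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                for rr in range(max(0, r - radius), min(h, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(w, c + radius + 1)):
                        out[rr][cc] = 1
    return out


def filter_keypoints(mask: Mask, keypoints: List[Point],
                     max_radius: int = 3,
                     min_static: int = 4) -> Tuple[int, List[Point]]:
    """Assumed rule: try radii from large to small and keep the first one
    that leaves at least `min_static` feature points outside the mask.
    Radius 0 (the raw mask) is the fallback."""
    static: List[Point] = []
    for radius in range(max_radius, -1, -1):
        grown = dilate(mask, radius)
        static = [p for p in keypoints if not grown[p[0]][p[1]]]
        if len(static) >= min_static:
            return radius, static
    return 0, static


# Toy 8x8 frame with a 2x2 dynamic object in the centre.
mask = [[0] * 8 for _ in range(8)]
for r in (3, 4):
    for c in (3, 4):
        mask[r][c] = 1

keypoints = [(0, 0), (0, 7), (7, 0), (7, 7), (2, 2), (3, 3), (1, 4), (5, 6)]
radius, static = filter_keypoints(mask, keypoints)
print(radius, static)
```

In this toy frame a radius of 3 would cover every feature point, so the sketch falls back to radius 2, which keeps the four corner points. A real system would apply the same idea to the SegFormer mask and the ORB keypoints of each frame, most likely with an optimized dilation routine instead of nested Python loops.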

FU Qiang, ZHONG Zhen, JI Yuanfa, REN Fenghua. Real-time semantic SLAM algorithm based on dynamic scenes[J]. Bulletin of Surveying and Mapping, 2025, 0(4): 27-33.

