Bulletin of Surveying and Mapping (测绘通报), 2025, Vol. 0, Issue (4): 27-33. doi: 10.13474/j.cnki.11-2246.2025.0405

• Artificial Intelligence and Vision Systems •

Real-time semantic SLAM algorithm based on dynamic scenes

FU Qiang1,2, ZHONG Zhen1,2, JI Yuanfa1,2,3,4, REN Fenghua1,2

  1. Guangxi Key Laboratory of Precision Navigation Technology and Application, Guilin University of Electronic Technology, Guilin 541004, China;
  2. School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China;
  3. Joint Laboratory for International Cooperation on Spatio-Temporal Information and Intelligent Location Services, Guilin 541004, China;
  4. Guangxi Institute of Industry and Research Spatio-Temporal Information Technology Co., Ltd., Nanning 530023, China
  • Received: 2024-08-01 Published: 2025-04-28
  • Corresponding author: REN Fenghua. E-mail: 643749271@qq.com
  • About the author: FU Qiang (b. 1976), male, master's degree, professor-level senior experimentalist; main research interests are image processing, satellite navigation, and positioning. E-mail: 2310396462@qq.com
  • Funding:
    National Natural Science Foundation of China (U23A20280; 62161007; 62061010); Guangxi Science and Technology Department projects (桂科AD22080061; 桂科AA23062038; 桂科AB23026120; 桂科ZY22096026)

Abstract: Traditional visual SLAM (simultaneous localization and mapping) suffers from low localization accuracy and poor robustness in indoor dynamic environments, poor real-time performance once it is combined with deep learning, and an inability to build dense maps. To address these problems, this paper proposes an improved algorithm based on ORB-SLAM3. First, a lightweight SegFormer semantic segmentation network identifies the dynamic objects in each image, and an adaptive mask dilation method then adjusts the dilation range of the mask automatically according to the number of feature points, so that static feature points are retained and potential dynamic feature points are removed more effectively. Second, the bag-of-words model is improved to speed up loading and matching. Finally, a dense mapping thread is added that uses the mask information and keyframes to build a dense point cloud map with dynamic features removed. Experimental results show that the algorithm effectively removes feature points on dynamic objects in highly dynamic scenes and improves the localization accuracy and robustness of the system. The average processing speed is 20.3 FPS, which basically meets the requirements of real-time operation.

Key words: VSLAM, ORB-SLAM3, semantic segmentation, dense mapping
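The abstract says that the mask dilation range is adjusted according to the number of feature points, but it does not state the rule. As a rough illustration only, the sketch below assumes one plausible rule: start from a large dilation radius and shrink it until enough static feature points survive. The function names `dilate` and `filter_keypoints`, the parameters `max_radius` and `min_static`, and the square structuring element are all hypothetical choices made for this example, not the authors' implementation.

```python
import numpy as np


def dilate(mask: np.ndarray, radius: int) -> np.ndarray:
    """Binary dilation of a boolean mask with a (2*radius+1) square window."""
    if radius <= 0:
        return mask.copy()
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    # OR together every shifted copy of the mask inside the window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def filter_keypoints(keypoints, mask, max_radius=8, min_static=50):
    """Drop keypoints that fall on the dilated dynamic-object mask.

    keypoints: (N, 2) integer array of (x, y) pixel coordinates.
    mask:      (H, W) boolean array, True on dynamic-object pixels.

    Assumed rule (hypothetical): try the largest radius first and shrink it
    until at least `min_static` keypoints remain, or the radius reaches 0.
    Returns the surviving keypoints and the radius that was used.
    """
    keypoints = np.asarray(keypoints)
    for radius in range(max_radius, -1, -1):
        grown = dilate(mask, radius)
        keep = ~grown[keypoints[:, 1], keypoints[:, 0]]
        if keep.sum() >= min_static or radius == 0:
            return keypoints[keep], radius


if __name__ == "__main__":
    # Toy frame: one 4x4 "dynamic object" and four keypoints.
    demo_mask = np.zeros((20, 20), dtype=bool)
    demo_mask[8:12, 8:12] = True
    demo_kps = np.array([[9, 9], [13, 9], [0, 0], [19, 19]])
    static_kps, used_radius = filter_keypoints(
        demo_kps, demo_mask, max_radius=3, min_static=3
    )
    print("radius used:", used_radius)
    print("static keypoints:", static_kps.tolist())
```

In this sketch a frame with many feature points tolerates a wide safety margin around each segmented object, while a feature-poor frame falls back to a tighter margin so that tracking still has enough static points to work with. In a real pipeline the mask would come from the segmentation network and the dilation would more likely use an optimized image-processing routine.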

CLC number: