Bulletin of Surveying and Mapping ›› 2025, Vol. 0 ›› Issue (7): 73-79. doi: 10.13474/j.cnki.11-2246.2025.0712

• Academic Research •

Semantic mask recognition for enhanced visual global localization

LI Jie1, YIN Fei2, LIU Jingbin2, LI Mengxiang3, ZHANG Wei4

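  1. Shanghai Investigation, Design & Research Institute Co., Ltd., Shanghai 200434, China;
    2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, Hubei 430079, China;
    3. Shenzhen R&D Center of State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Shenzhen, Guangdong 518057, China;
    4. Wuhan Geo-detection Technology Co., Ltd., Wuhan, Hubei 430022, China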
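  • Received: 2024-12-25 Published: 2025-08-02
  • About the author: LI Jie (born 1984), male, master's degree, senior engineer, mainly engaged in research on intelligent perception and navigation technology and its applications. E-mail: li_jie14@ctg.com.cn
  • Supported by:
    National Natural Science Foundation of China (42474060); Shenzhen Science and Technology Program (JCYJ20210324123611032); Natural Science Foundation of Hubei Province (2024AFD403); Wuhan Key R&D Program, Artificial Intelligence Innovation Special Project (2023010402040029)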

Semantic mask segmentation enhanced indoor visual global localization

LI Jie1, YIN Fei2, LIU Jingbin2, LI Mengxiang3, ZHANG Wei4   

  1. Shanghai Investigation, Design & Research Institute Co., Ltd., Shanghai 200434, China;
    2. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China;
    3. Shenzhen R&D Center of State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Shenzhen 518057, China;
    4. Wuhan Geo-detection Technology Co., Ltd., Wuhan 430022, China
  • Received: 2024-12-25 Published: 2025-08-02

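Abstract: In recent years, as new techniques for image processing and feature extraction have emerged in computer vision, vision-based indoor positioning has attracted wide attention. Unlike the relative positioning of visual odometry, visual global localization fuses images with a visual map to provide an absolute pose with a geographic reference datum. Structured indoor scenes contain many kinds of movable objects and large amounts of repetitive, similar texture, both of which challenge visual localization indoors. This paper proposes an indoor global localization method based on recognizing and removing semantic masks: a panoptic segmentation method identifies the movable objects and repetitive-texture regions in the scene, and an environmental semantic mask is built to eliminate their influence on image database construction and on matching-based localization, improving the accuracy and robustness of visual global localization. Experimental results show that the 3D point cloud reconstructed by the proposed method has fewer noise points and a clearer, more complete indoor structure, and that the positioning accuracy reaches 0.84 m (1σ).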

Keywords: indoor positioning, visual positioning, semantic mask, structure from motion, global localization

Abstract: In recent years, with the emergence of new image processing and feature extraction techniques in computer vision, visual indoor positioning has attracted extensive attention. Unlike the relative positioning of visual odometry, visual global positioning methods fuse images with a visual map to provide an absolute pose with a geographic reference datum. Indoor structured scenes contain various movable targets and a large number of repetitive, similar textures, which pose a challenge to visual positioning in indoor environments. In this paper, an indoor positioning method combining structure from motion (SfM) and panoptic segmentation is proposed. An environmental semantic mask is designed to remove the interference of movable targets and repeated textures in the image and to improve the accuracy and reliability of global positioning. The experimental results show that the SfM method with the semantic mask reconstructs 3D point clouds with less noise and a clearer indoor structure, and that the positioning accuracy reaches 0.84 m (1σ).

Key words: indoor positioning, visual positioning, semantic mask, SfM, global localization
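
The masking idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a panoptic segmentation model has already produced a per-pixel class label map, and the class ids, the function names `build_semantic_mask` and `filter_keypoints`, and the keypoint layout are all hypothetical. The sketch only shows how keypoints falling on masked classes might be discarded before image database construction and matching.

```python
import numpy as np

# Hypothetical class ids; a real panoptic model defines its own label set.
MASKED_CLASS_IDS = {1, 2}  # e.g. 1 = person, 2 = chair (assumed, not from the paper)


def build_semantic_mask(label_map, masked_ids):
    """Return a boolean H x W array that is True where features may be kept."""
    return ~np.isin(label_map, list(masked_ids))


def filter_keypoints(keypoints_xy, keep_mask):
    """Drop keypoints (N x 2 array, columns are x then y in pixels)
    that fall on masked pixels."""
    h, w = keep_mask.shape
    # Round to the nearest pixel and clamp to the image bounds.
    cols = np.clip(np.round(keypoints_xy[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.round(keypoints_xy[:, 1]).astype(int), 0, h - 1)
    return keypoints_xy[keep_mask[rows, cols]]


if __name__ == "__main__":
    # Toy 4 x 6 label map: the right half is covered by a masked class.
    label_map = np.zeros((4, 6), dtype=int)
    label_map[:, 3:] = 1
    keep_mask = build_semantic_mask(label_map, MASKED_CLASS_IDS)

    keypoints = np.array([[1.0, 1.0], [4.0, 2.0], [2.0, 3.0]])
    print(filter_keypoints(keypoints, keep_mask))
```

In this toy case the keypoint at x = 4 lies on the masked region and is removed, while the other two are kept. In an SfM pipeline the same filter would presumably be applied to both the database images and the query image, so that masked regions contribute neither 3D points nor matches.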
