Bulletin of Surveying and Mapping (测绘通报), 2025, Vol. 0, Issue (10): 71-75, 137. DOI: 10.13474/j.cnki.11-2246.2025.1012

• Academic Research •

Monocular vision-driven real-time high-precision dense scene reconstruction algorithm for robots

JIANG Xianglong1, DENG Wenliang1, HE Shengxi2,3

  1. College of Intelligent Manufacturing and Robotics, Chongqing College of Science and Creation, Chongqing 402160, China;
    2. State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing 400044, China;
    3. Chongqing Changan Automobile Co., Ltd., Chongqing 400023, China
  • Received: 2025-03-13  Published: 2025-10-31
  • About the author: JIANG Xianglong (1983-), male, master's degree, associate professor; his main research interests are electrical engineering and automation. E-mail: jiangxianglong1983@163.com
  • Funding: Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202505404)

Abstract: This paper proposes a monocular vision-driven real-time high-precision dense scene reconstruction algorithm for robots, which builds on deep dense monocular visual SLAM and fast uncertainty propagation to reconstruct 3D scenes from images. The algorithm achieves dense, accurate, and real-time 3D reconstruction while remaining robust to the extremely noisy depth estimates produced by monocular visual SLAM. Unlike traditional methods that rely on specialized depth filters or estimate depth uncertainty from RGB-D sensor models, the proposed approach derives probabilistic depth uncertainty directly from the information matrix of the underlying bundle adjustment problem in SLAM. This depth uncertainty provides a critical signal for weighting depth maps during volumetric fusion, yielding more precise 3D meshes with significantly fewer artifacts. Experimental validation on the challenging EuRoC dataset shows that, compared with directly fusing the depths from monocular visual SLAM, the proposed method improves mapping accuracy by 85%.
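
To make the two key ideas in the abstract concrete — per-pixel depth uncertainty taken from the bundle-adjustment information matrix, and uncertainty-weighted volumetric fusion — the sketch below shows a simplified, hypothetical implementation. It is not the authors' code: all function and parameter names (`depth_sigma_from_information`, `WeightedTSDF`, `K`, `T_wc`, the volume size) are invented for illustration, and the marginal depth variance is approximated by the reciprocal of the information-matrix diagonal rather than by a full marginalization.

```python
# Minimal, self-contained sketch (NumPy only) of inverse-variance-weighted
# TSDF fusion. All symbols here (information diagonal, K, T_wc, volume size)
# are illustrative assumptions, not the paper's actual interfaces.
import numpy as np


def depth_sigma_from_information(info_diag, floor=1e-6):
    """Per-pixel depth standard deviation from the diagonal of the
    bundle-adjustment information matrix restricted to the depth variables.
    Variance is approximated as 1 / information, a cheap stand-in for
    inverting (marginalizing) the full matrix."""
    return 1.0 / np.sqrt(np.maximum(info_diag, floor))


class WeightedTSDF:
    """Toy dense TSDF volume fused with per-pixel 1/sigma^2 weights."""

    def __init__(self, shape=(64, 64, 64), voxel_size=0.05, trunc=0.15):
        self.voxel_size, self.trunc = voxel_size, trunc
        self.sdf = np.zeros(shape, dtype=np.float32)
        self.weight = np.zeros(shape, dtype=np.float32)
        # Voxel centres in the world frame (volume origin at the corner).
        ii, jj, kk = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
        self.centres = np.stack([ii, jj, kk], axis=-1).astype(np.float32) * voxel_size

    def fuse(self, depth, sigma, K, T_wc):
        """Fuse one H x W depth map with per-pixel sigma into the volume.
        K is the 3x3 intrinsic matrix, T_wc the 4x4 camera-to-world pose."""
        T_cw = np.linalg.inv(T_wc)
        pts_c = self.centres.reshape(-1, 3) @ T_cw[:3, :3].T + T_cw[:3, 3]
        z = pts_c[:, 2]
        uv = (pts_c @ K.T)[:, :2] / np.maximum(z[:, None], 1e-9)
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        H, W = depth.shape
        valid = (z > 1e-6) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
        uc, vc = np.clip(u, 0, W - 1), np.clip(v, 0, H - 1)
        d = np.where(valid, depth[vc, uc], 0.0)
        s = np.where(valid, sigma[vc, uc], np.inf)
        # Projective signed distance, truncated as in standard TSDF fusion.
        sdf = np.clip(d - z, -self.trunc, self.trunc).reshape(self.sdf.shape)
        # Inverse-variance weight: highly uncertain depths contribute ~nothing.
        w = np.where(valid & (d > 0), 1.0 / (s * s + 1e-12), 0.0).reshape(self.sdf.shape)
        total = self.weight + w
        self.sdf = np.where(total > 0,
                            (self.sdf * self.weight + sdf * w) / np.maximum(total, 1e-12),
                            self.sdf)
        self.weight = total
```

With this weighting, a depth pixel whose bundle-adjustment information is near zero (i.e. essentially unconstrained) receives an almost-zero fusion weight, which is one way to realize the robustness to extreme depth noise that the abstract describes without introducing a separate depth filter.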

Key words: monocular vision, dense reconstruction, SLAM, depth uncertainty, robotics
