Bulletin of Surveying and Mapping ›› 2025, Vol. 0 ›› Issue (7): 66-72. doi: 10.13474/j.cnki.11-2246.2025.0711

• Academic Research •

GDS: drone image-guided cross-view image geographic positioning

XI Zexin, LI Jiayi, XIE Hao, GAN Wenjian, ZHOU Yang

  1. Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
  • Received: 2024-11-11 Published: 2025-08-02
  • Corresponding author: ZHOU Yang. E-mail: zhouyang3d@163.com
  • About the author: XI Zexin (born 2004), female; main research interests are UAV image processing and its applications. E-mail: xizexin123@163.com

Abstract: Cross-view image geo-localization determines the geographic coordinates of a ground-view image whose location is unknown by retrieving and matching it against reference satellite images that carry high-precision spatial coordinates. When the viewpoint difference between the ground-view image to be located and the reference satellite images is too large, retrieval and matching become difficult. This paper therefore proposes a UAV image-guided cross-view geo-localization method, ground-drone-satellite (GDS), which uses low-altitude UAV oblique photography images as a transition. The ground-view image to be located is first matched against UAV images, and the retrieved UAV image is then matched against satellite images with accurate geographic coordinates, which determines the geographic position of the ground-view image. Image features are extracted with ConvNeXt, a convolutional neural network that adopts design ideas from the Vision Transformer, and the model is trained by contrastive learning with the InfoNCE loss as the training objective, which improves retrieval accuracy. A random sampling strategy that shuffles the training samples and randomly removes a small fraction of them improves the generalization ability of the model. Experimental results on University-1652, a widely used cross-view dataset, show that the proposed method outperforms direct retrieval of satellite images from ground-view images in terms of Recall and average precision (AP). The method achieves 11.63% Recall@1 when querying UAV-view images with ground-view images and 91.49% Recall@1 when querying satellite-view images with UAV-view images. The combined two-stage retrieval from ground-view images to satellite-view images reaches 10.64% Recall@1, roughly double the 5.23% Recall@1 obtained by retrieving satellite images directly from ground-view images, which verifies the effectiveness of the proposed method and its advantage over direct retrieval.
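
The two-stage retrieval described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that image embeddings are already available as plain vectors, that ground, UAV, and satellite embeddings share one feature space, and that matching uses cosine similarity. The function names (`two_stage_locate`, `recall_at_1`) and the toy galleries are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top1(query, gallery):
    """Return the key of the gallery embedding most similar to the query."""
    return max(gallery, key=lambda k: cosine(query, gallery[k]))

def two_stage_locate(ground_emb, drone_gallery, satellite_gallery):
    """Ground -> drone -> satellite retrieval (assumed shared embedding space)."""
    drone_id = top1(ground_emb, drone_gallery)               # stage 1
    sat_id = top1(drone_gallery[drone_id], satellite_gallery)  # stage 2
    return drone_id, sat_id

def recall_at_1(predicted, truth):
    """Fraction of queries whose top-1 result equals the ground truth."""
    hits = sum(p == t for p, t in zip(predicted, truth))
    return hits / len(truth)

# Toy embeddings, invented purely for illustration.
drone_gallery = {"drone_A": [0.9, 0.1, 0.0], "drone_B": [0.0, 0.2, 0.9]}
satellite_gallery = {"sat_A": [1.0, 0.0, 0.1], "sat_B": [0.1, 0.0, 1.0]}
ground_query = [0.7, 0.3, 0.1]

print(two_stage_locate(ground_query, drone_gallery, satellite_gallery))
```

In this sketch a second-stage match can only succeed if the first stage returns a UAV image of the correct location, so the first stage bounds the end-to-end accuracy. Numerically, 0.1163 × 0.9149 ≈ 0.1064, which matches the reported two-stage Recall@1; the abstract does not state whether the figure was derived this way.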

Key words: image geo-localization, cross-view, drone images, satellite images, ConvNeXt, InfoNCE loss
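
InfoNCE loss, listed among the key words and named in the abstract as the training objective, can be sketched as follows. This is a generic one-directional formulation rather than the paper's exact loss: it assumes non-zero embedding vectors, that `queries[i]` and `references[i]` form the positive pair while all other references in the batch act as negatives, and a temperature of 0.07, which is a placeholder value not taken from the source.

```python
import math

def info_nce(queries, references, temperature=0.07):
    """Mean InfoNCE loss over a batch of (query, reference) positive pairs.

    Assumptions (not from the source): L2-normalised dot-product similarity,
    in-batch negatives, and a hypothetical default temperature of 0.07.
    """
    def normalize(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    q = [normalize(v) for v in queries]
    r = [normalize(v) for v in references]
    total = 0.0
    for i, qi in enumerate(q):
        # Similarity of query i to every reference, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(qi, rj)) / temperature for rj in r]
        m = max(logits)  # subtract the max before exp for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]  # -log softmax of the positive pair
    return total / len(q)

# Toy batch: aligned pairs should give a much lower loss than swapped pairs.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

The loss shrinks as each query becomes more similar to its own reference than to the other references in the batch, which is the behaviour a retrieval model needs at query time.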

CLC number: