GDS: drone image-guided cross-view image geographic positioning

doi:10.13474/j.cnki.11-2246.2025.0711

Abstract

Cross-view image geo-localization determines the geographic coordinates of a ground-view image whose location is unknown by retrieving and matching it against reference satellite images that carry high-precision spatial coordinates. When the viewpoint difference between the query ground-view image and the reference satellite images is too large, retrieval and matching become difficult. This paper therefore proposes a drone (UAV) image-guided cross-view geo-localization method, ground-drone-satellite (GDS), which uses low-altitude oblique drone imagery as a transition. The ground-view query is first matched against drone images, and the retrieved drone image is then matched against satellite images with accurate geographic coordinates, which yields the location of the ground-view image. Image features are extracted with ConvNeXt, a convolutional network whose design draws on the Vision Transformer, and the model is trained by contrastive learning with the InfoNCE loss as the objective, which improves retrieval accuracy. A random sampling strategy that shuffles the training samples and randomly removes a small fraction of them improves the model's generalization. Experiments on University-1652, a widely used cross-view dataset, show that the proposed method outperforms direct ground-to-satellite retrieval in terms of recall and average precision (AP). Ground-to-drone retrieval reaches 11.63% Recall@1 and drone-to-satellite retrieval reaches 91.49% Recall@1. Combining the two stages, ground-to-satellite retrieval reaches 10.64% Recall@1, roughly double the 5.23% Recall@1 of direct ground-to-satellite retrieval, which supports the effectiveness of the proposed method.
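The two-stage retrieval described in the abstract can be illustrated with a minimal sketch. It assumes each image has already been mapped to an embedding vector and that matching is nearest-neighbour search by cosine similarity; the function names, the toy three-dimensional embeddings, and the coordinates are all hypothetical and are not taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top1(query, gallery):
    """Return the key of the gallery embedding most similar to the query."""
    return max(gallery, key=lambda key: cosine(query, gallery[key]))

def two_stage_locate(ground_vec, drone_gallery, satellite_gallery, satellite_coords):
    """Ground -> drone -> satellite retrieval (illustrative, not the authors' code)."""
    drone_id = top1(ground_vec, drone_gallery)                 # stage 1
    sat_id = top1(drone_gallery[drone_id], satellite_gallery)  # stage 2
    return drone_id, sat_id, satellite_coords[sat_id]

# Toy data: embeddings and coordinates are made up for illustration.
drone_gallery = {"drone_A": [0.9, 0.1, 0.0], "drone_B": [0.0, 0.2, 0.9]}
satellite_gallery = {"sat_A": [1.0, 0.0, 0.1], "sat_B": [0.1, 0.0, 1.0]}
satellite_coords = {"sat_A": (10.0, 20.0), "sat_B": (30.0, 40.0)}
ground_query = [0.8, 0.3, 0.1]

result = two_stage_locate(ground_query, drone_gallery, satellite_gallery, satellite_coords)
print(result)

# Stage-wise Recall@1 values reported in the abstract.
recall_ground_to_drone = 0.1163
recall_drone_to_satellite = 0.9149
combined = recall_ground_to_drone * recall_drone_to_satellite
print(f"{combined:.4f}")
```

The product of the two stage-wise Recall@1 values, 0.1163 × 0.9149 ≈ 0.1064, agrees with the reported end-to-end figure of 10.64%. The abstract does not state whether the end-to-end figure was computed this way or measured directly.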

Key words: image geo-localization, cross-view, drone images, satellite images, ConvNeXt, InfoNCE loss
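For readers unfamiliar with the InfoNCE loss named in the keywords, the following is a minimal pure-Python sketch of the one-directional form. It assumes a batch in which `queries[i]` and `references[i]` are embeddings of the same location and every other reference acts as a negative. The temperature value of 0.1 and the function name are assumptions for illustration; the abstract does not give the authors' hyperparameters or whether they use a symmetric variant.

```python
import math

def info_nce(queries, references, temperature=0.1):
    """Mean InfoNCE loss over a batch where queries[i] matches references[i].

    temperature=0.1 is an assumed value, not taken from the paper.
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    q = [normalize(v) for v in queries]
    r = [normalize(v) for v in references]

    total = 0.0
    for i, qi in enumerate(q):
        # Scaled cosine similarity of query i to every reference in the batch.
        logits = [sum(a * b for a, b in zip(qi, rj)) / temperature for rj in r]
        # Numerically stable log-sum-exp over the logits.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching reference (index i) as the target.
        total += log_sum_exp - logits[i]
    return total / len(q)

# Correctly paired embeddings give a loss near zero.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
# Swapping the references makes every positive pair dissimilar, so the loss is large.
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

Minimizing this loss pulls matching cross-view embeddings together and pushes non-matching ones apart, which is what makes the nearest-neighbour retrieval step meaningful.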

CLC number: P237

XI Zexin, LI Jiayi, XIE Hao, GAN Wenjian, ZHOU Yang. GDS: drone image-guided cross-view image geographic positioning[J]. Bulletin of Surveying and Mapping, 2025(7): 66-72.
