Cross-modal underwater shipwreck recognition algorithm based on YOLOv8-CPCA

Bulletin of Surveying and Mapping, 2026, Vol. 0, Issue (4): 73-80. doi: 10.13474/j.cnki.11-2246.2026.0411

SUN Hao'an, WANG Zhaoying, WANG Yu

School of Geospatial Information, Information Engineering University, Zhengzhou 450001, China

Received: 2025-09-15 Published: 2026-05-12
Corresponding author: WANG Zhaoying. E-mail: xdyy121@163.com
About the author: SUN Hao'an (born 2004), female, research interest: marine surveying and mapping. E-mail: 511886801@qq.com
Funding:
National Natural Science Foundation of China (42574011); Key Laboratory Field Fund (2025-JCJQ-JJ-0286)

Abstract

Abstract: In underwater target recognition, acoustic data are strongly affected by noise and susceptible to interference, while optical data become markedly harder to acquire as depth increases. This paper therefore proposes fusing acoustic and optical images to improve the accuracy of underwater target recognition. Because paired underwater acoustic and optical image datasets are scarce, a Cycle-GAN network is used to augment the dataset samples, and image enhancement is applied to the generated dataset. On the recognition side, a Transformer cross-modal attention module and the channel prior convolutional attention (CPCA) mechanism are introduced into the YOLOv8 algorithm. Experiments on a side-scan sonar shipwreck target dataset show that, compared with recognition algorithms based on acoustic data alone or optical data alone, the cross-modal acoustic-optical fusion algorithm raises the mean average precision by 0.175 and 0.165, respectively. The optimized backbone network integrates acoustic and optical features across modalities, improves feature extraction efficiency, and addresses the difficulty of distinguishing the edges of underwater shipwrecks from the surrounding environment.

Key words: target recognition, acoustic-optical fusion, Cycle-GAN, cross-modal attention, YOLO algorithm, fusion algorithm, Transformer
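
The abstract does not spell out how the Transformer cross-modal attention module is built, so the following is only a minimal sketch of the general idea, assuming single-head scaled dot-product attention in which acoustic feature tokens act as queries and optical feature tokens act as keys and values. The function names, the identity projections (no learned weights), the residual connection, and the toy feature vectors are all illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_modal_attention(acoustic, optical):
    """Single-head scaled dot-product cross-attention (illustrative only).

    acoustic: list of query vectors, one per acoustic (sonar) feature token.
    optical:  list of key/value vectors, one per optical feature token.
    Learned projection matrices are omitted (treated as identity) for brevity.
    Returns (fused, weights).
    """
    d = len(optical[0])
    scale = math.sqrt(d)
    fused, weights = [], []
    for q in acoustic:
        # Similarity of this acoustic token to every optical token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in optical]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the optical value vectors.
        attended = [sum(wj * v[i] for wj, v in zip(w, optical)) for i in range(d)]
        # Residual connection (an assumption) keeps the acoustic information.
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused, weights

# Hypothetical toy features: 3 acoustic tokens, 2 optical tokens, dimension 2.
acoustic_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
optical_tokens = [[2.0, 0.0], [0.0, 2.0]]
fused, weights = cross_modal_attention(acoustic_tokens, optical_tokens)
```

Each row of `weights` sums to one, so every fused acoustic token is its original vector plus a convex combination of optical tokens. In a real detector these tokens would be flattened backbone feature maps and the projections would be learned.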

CLC number: P237

SUN Hao'an, WANG Zhaoying, WANG Yu. Cross-modal underwater shipwreck recognition algorithm based on YOLOv8-CPCA[J]. Bulletin of Surveying and Mapping, 2026, 0(4): 73-80.

