测绘通报 (Bulletin of Surveying and Mapping) ›› 2025, Vol. 0 ›› Issue (9): 70-77. doi: 10.13474/j.cnki.11-2246.2025.0912

• Academic Research •

ADM-YOLOv11:动态自适应铁塔视频多尺度目标检测算法

唐芝青1, 张涛1, 王培玉2, 向导1, 刘海飞1, 刘仁峰1, 贺江江1   

  1. 湖南省第二测绘院, 湖南 长沙 410000;
    2. 湘潭大学, 湖南 湘潭 411105
  • 收稿日期:2025-05-09 发布日期:2025-09-29
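  • About the author: TANG Zhiqing (b. 1984), male, master's degree, senior engineer; main research interests are satellite remote sensing, tower video monitoring, and geographic information system development. E-mail: 591674516@qq.com
  • Funding:
    Natural Science Foundation of Hunan Province (2024JJ8331)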

ADM-YOLOv11: dynamic adaptive multi-scale object detection algorithm for tower video

TANG Zhiqing1, ZHANG Tao1, WANG Peiyu2, XIANG Dao1, LIU Haifei1, LIU Renfeng1, HE Jiangjiang1   

  1. Hunan Second Institute of Surveying and Mapping, Changsha 410000, China;
    2. Xiangtan University, Xiangtan 411105, China
  • Received: 2025-05-09 Published: 2025-09-29

摘要: 针对铁塔视频大视角动态场景下多尺度目标检测精度不足、小目标识别率低下等技术瓶颈,本文提出一种基于多模块协同优化的ADM-YOLOv11目标检测算法。首先,在Backbone网络中嵌入自适应特征增强模块(AFE),对C3K2模块进行深度重构,通过空间上下文感知与特征精细化机制,显著提升网络对复杂场景的特征提取能力;其次,将高效多尺度注意力模块(EMA)集成至Neck的C3K2模块,增强模型对多尺度目标的检测稳健性;然后,在Neck结构中引入超轻量动态上采样器DySample,以替代传统上采样层,优化多尺度特征的细节表达与语义融合效率;最后,采用EMASlideLoss分类损失函数,通过动态加权策略抑制数据不均衡导致的梯度偏移,有效提升模型泛化性能。试验结果表明,本文模型使mAP50-95从基准模型的74.8%提升至82.6%,mAP50提升至96.6%,ADM-YOLOv11显著提升了铁塔视频动态场景中多尺度目标检测精度。

关键词: 铁塔视频, 多尺度, YOLOv11, 自适应特征增强, 动态上采样

Abstract: To address insufficient multi-scale detection accuracy and low small-target recognition rates in the large-view dynamic scenes of tower video, this paper proposes ADM-YOLOv11, an object detection algorithm based on the collaborative optimization of multiple modules. First, an adaptive feature enhancement (AFE) module is embedded in the Backbone network to restructure the C3K2 module; through spatial context awareness and feature refinement, it improves the network's feature extraction in complex scenes. Second, an efficient multi-scale attention (EMA) module is integrated into the C3K2 module of the Neck to make the detection of multi-scale targets more robust. Third, the ultra-lightweight dynamic upsampler DySample (dynamic upsample) replaces the traditional upsampling layer in the Neck, improving the detail representation and semantic fusion efficiency of multi-scale features. Finally, the EMASlideLoss classification loss function is adopted; its dynamic weighting strategy suppresses the gradient shift caused by data imbalance and improves the model's generalization. Experimental results show that the proposed model raises mAP50-95 from 74.8% for the baseline model to 82.6%, a gain of 7.8 percentage points, and raises mAP50 to 96.6%, improving multi-scale object detection accuracy in the dynamic scenes of tower video.

Key words: tower video, multi-scale, YOLOv11, adaptive feature enhancement, dynamic upsampling
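
The abstract names EMASlideLoss but does not give its formula. As a rough illustration of what a "dynamic weighting strategy" of this kind can look like, the following minimal sketch assumes the Slide weighting popularized by YOLO-FaceV2 (samples whose IoU-based score lies near a threshold are up-weighted) and assumes the threshold is an exponential moving average of the batch mean IoU. The names `slide_weight`, `EmaThreshold`, and `weighted_bce`, as well as the decay and initial values, are hypothetical and are not taken from the paper.

```python
import math


def slide_weight(score: float, mu: float) -> float:
    """Assumed Slide weighting: 1 for easy negatives, boosted near/above mu."""
    if score <= mu - 0.1:
        return 1.0                      # well below the threshold: unchanged
    if score < mu:
        return math.exp(1.0 - mu)       # just below the threshold: constant boost
    return math.exp(1.0 - score)        # at or above the threshold: decaying boost


class EmaThreshold:
    """Assumed EMA tracker for the threshold mu (hypothetical decay/init)."""

    def __init__(self, decay: float = 0.999, init: float = 0.5) -> None:
        self.decay = decay
        self.value = init

    def update(self, batch_mean_iou: float) -> float:
        # Blend the previous threshold with the current batch statistic.
        self.value = self.decay * self.value + (1.0 - self.decay) * batch_mean_iou
        return self.value


def weighted_bce(prob: float, target: float, mu: float) -> float:
    """Binary cross-entropy scaled by the Slide weight of the target score."""
    eps = 1e-7
    p = min(max(prob, eps), 1.0 - eps)  # clamp to keep log() finite
    bce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    return slide_weight(target, mu) * bce


if __name__ == "__main__":
    ema = EmaThreshold(decay=0.9, init=0.5)
    mu = ema.update(0.7)                # threshold drifts toward the batch mean
    for s in (0.2, 0.5, 0.9):
        print(s, slide_weight(s, mu), weighted_bce(0.6, s, mu))
```

Smoothing the threshold with a moving average, rather than recomputing it from each batch alone, keeps the weights from jumping between batches; whether the paper uses exactly this schedule is not stated in the abstract.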

CLC number: