ADM-YOLOv11: dynamic adaptive multi-scale object detection algorithm for tower video

doi:10.13474/j.cnki.11-2246.2025.0912

Abstract

To address the limited detection accuracy for multi-scale targets and the low recognition rate for small targets in the wide-field dynamic scenes of tower videos, this paper proposes ADM-YOLOv11, an object detection algorithm based on the collaborative optimization of multiple modules. First, an adaptive feature enhancement (AFE) module is embedded in the Backbone network to restructure the C3K2 module; its spatial context awareness and feature refinement mechanisms improve the network's feature extraction in complex scenes. Second, an efficient multi-scale attention (EMA) module is integrated into the C3K2 module of the Neck to make the detection of multi-scale targets more robust. Third, the ultra-lightweight dynamic upsampler DySample (dynamic upsample) replaces the traditional upsampling layer in the Neck, improving the detail representation and semantic fusion efficiency of multi-scale features. Finally, the EMASlideLoss classification loss function is adopted; its dynamic weighting strategy suppresses the gradient shift caused by data imbalance and improves the generalization performance of the model. Experimental results show that the proposed model raises mAP₅₀₋₉₅ from 74.8% for the baseline model to 82.6%, a gain of 7.8 percentage points, and raises mAP₅₀ to 96.6%. ADM-YOLOv11 thus improves the detection accuracy of multi-scale targets in the dynamic scenes of tower videos.
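The abstract does not give the EMASlideLoss formula, so the following is only a minimal sketch of what a dynamic weighting strategy of this kind can look like. It assumes the slide-style weighting found in open-source YOLO variants, with the threshold mu tracked as an exponential moving average of the batch-mean IoU. The names `slide_weight` and `EmaThreshold`, the 0.1 band width, and the `decay` and `tau` defaults are illustrative assumptions, not values taken from the paper.

```python
import math


def slide_weight(iou: float, mu: float) -> float:
    """Per-sample weight as a function of its IoU and the threshold mu (assumed form)."""
    if iou <= mu - 0.1:
        return 1.0                 # well below the threshold: weight left at 1
    if iou < mu:
        return math.exp(1.0 - mu)  # band just below mu: constant boost
    return math.exp(1.0 - iou)     # at or above mu: boost shrinks to 1 as IoU -> 1


class EmaThreshold:
    """Exponential moving average of the batch-mean IoU, used as mu (assumed form)."""

    def __init__(self, decay: float = 0.999, tau: float = 2000.0, init: float = 1.0):
        self.decay = decay
        self.tau = tau
        self.value = init
        self.updates = 0

    def update(self, batch_mean_iou: float) -> float:
        self.updates += 1
        # Warm-up: the effective decay ramps from near 0 towards `decay`,
        # so early batches move the threshold quickly.
        d = self.decay * (1.0 - math.exp(-self.updates / self.tau))
        self.value = d * self.value + (1.0 - d) * batch_mean_iou
        return self.value
```

In a training loop under these assumptions, `EmaThreshold.update` would be called once per batch, and each sample's classification loss would be multiplied by `slide_weight(iou, mu)`, so that samples near the moving threshold contribute more to the gradient.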

Key words: tower video, multi-scale, YOLOv11, adaptive feature enhancement, dynamic upsampling

CLC Number: P237

TANG Zhiqing, ZHANG Tao, WANG Peiyu, XIANG Dao, LIU Haifei, LIU Renfeng, HE Jiangjiang. ADM-YOLOv11: dynamic adaptive multi-scale object detection algorithm for tower video[J]. Bulletin of Surveying and Mapping, 2025, 0(9): 70-77.
