Bulletin of Surveying and Mapping ›› 2022, Vol. 0 ›› Issue (5): 38-44. doi: 10.13474/j.cnki.11-2246.2022.0138

Multi-Patch multi-frame incremental traffic video object detection method based on YOLO v4

WEN Nu1,2,3, GUO Renzhong1,2,3, HE Biao1,2,3,4, WAN Yuan5   

  1. School of Architecture & Urban Planning, Shenzhen University, Shenzhen 518061, China;
    2. Research Institute for Smart Cities, Shenzhen University, Shenzhen 518061, China;
    3. Guangdong-Hong Kong-Macau Joint Laboratory for Smart Cities, Shenzhen 518061, China;
    4. Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, Shenzhen 518034, China;
    5. College of Urban and Environmental Sciences, Hubei Normal University, Huangshi 435002, China
  • Received: 2021-06-07  Revised: 2022-02-25  Published: 2022-06-08

Abstract: Improving the generalization ability of object detection models is a research focus and a key issue in computer vision. This paper proposes a Multi-Patch method and a multi-frame incremental prediction strategy to improve the robustness of traffic video object detection across different scenarios and to effectively address the low object recall in videos caused by variable object scales. Based on the video resolution and object size, the Multi-Patch method automatically divides each video frame into patches of the best input size. A YOLO v4 neural network then correlates the context information of consecutive frames, and the incremental prediction strategy reduces the missed detection rate, improving the detection confidence score and recall rate of video objects in different scenarios. Traffic videos collected under different shooting conditions are used to verify the effectiveness of the algorithm. Experimental results show that the proposed method achieves a recall rate of more than 80% and an average confidence score of more than 0.84.
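
The abstract does not spell out how frames are divided, so the following is only a minimal sketch of the general idea of tiling a frame into detector-sized patches, not the authors' Multi-Patch algorithm. The function names, the fixed patch size of 608 pixels, and the 64-pixel overlap are all illustrative assumptions; the paper instead chooses the size automatically from the video resolution and object size.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def patch_origins(length: int, patch: int, overlap: int) -> List[int]:
    """Start offsets of patches of size `patch` that cover [0, length)."""
    if patch >= length:
        return [0]  # a single patch already covers this axis
    stride = patch - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the patch size")
    starts = list(range(0, length - patch, stride))
    starts.append(length - patch)  # last patch sits flush with the far edge
    return starts


def split_frame(width: int, height: int,
                patch: int = 608, overlap: int = 64) -> List[Box]:
    """Tile a frame into overlapping square patches (hypothetical helper).

    The default patch size and overlap are assumptions for illustration.
    """
    boxes = []
    for y in patch_origins(height, patch, overlap):
        for x in patch_origins(width, patch, overlap):
            boxes.append((x, y, min(x + patch, width), min(y + patch, height)))
    return boxes


def to_frame_coords(det: Box, patch_box: Box) -> Box:
    """Shift a detection from patch coordinates back to frame coordinates."""
    ox, oy = patch_box[0], patch_box[1]
    return (det[0] + ox, det[1] + oy, det[2] + ox, det[3] + oy)


if __name__ == "__main__":
    for patch_box in split_frame(1920, 1080):
        print(patch_box)
```

In such a scheme each patch would be passed to the detector separately, and the per-patch detections would be shifted back with `to_frame_coords` and merged (for example with non-maximum suppression) before any multi-frame step. The overlap keeps objects that straddle a patch boundary fully visible in at least one patch.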

Key words: video object detection, multi-frame fusion, YOLO v4, convolutional neural networks
