Bulletin of Surveying and Mapping ›› 2024, Vol. 0 ›› Issue (5): 12-18. doi: 10.13474/j.cnki.11-2246.2024.0503

• Landslide Hazard Identification •

Study on sample imbalance in landslide recognition algorithms based on deep learning

WANG Lixia1, XI Wenfei1,2, SHI Zhengtao1,2, ZHAO Zilong3, QIAN Tanghui1, ZHAO Lei1, MA Yijie1

  1. Faculty of Geography, Yunnan Normal University, Kunming 650500, China;
  2. Key Laboratory of Geographical Process and Environmental Change of Yunnan Plateau, Kunming 650500, China;
  3. Yunnan Haiju Geographic Information Technology Co., Ltd., Kunming 650000, China
  • Received: 2023-09-01 Published: 2024-06-12
  • About the author: WANG Lixia (born 1995), female, master's student; her main research interest is landslide hazard identification from remote sensing imagery using deep learning. E-mail: 690125218@qq.com
  • Funding:
    High-Resolution Comprehensive Application Demonstration for the Central Yunnan Water Diversion Project (Central Yunnan Plateau Area) (89-Y50-G31-9001-22/23-05); Yunnan Provincial Major Science and Technology Special Project (202202AD080010); Yunnan Normal University Graduate Research and Innovation Fund (YJSJJ23-B152); Open Fund of the Yunnan Provincial University Key Laboratory of Early Rapid Identification, Prevention and Control of Engineering Geological Hazards in Traffic Corridors of High-Intensity Earthquake Mountainous Areas (KLGDTC-2021-02)

Abstract: Landslides are a common geological hazard that, once they occur, damage natural ecosystems and cause major property losses and casualties, so obtaining landslide information quickly and accurately is crucial for disaster prevention and mitigation. Traditional deep learning methods depend heavily on the quality of landslide samples, yet existing samples are of uneven quality, and the effect of landslide sample imbalance on the performance of deep learning models is rarely considered. To address the question of how improving sample quality can raise model accuracy, this paper starts from sample quality and proposes a Faster R-CNN landslide object detection method based on multi-source imbalanced samples. By training jointly on several kinds of imbalanced samples, it examines how different samples affect the overall performance of the model. The results show that: ① Under difficult-sample imbalance, the model achieves an accuracy of 85.16%, an F1 score of 0.69, a precision of 56.96%, a recall of 86.58%, and a missed detection rate of 0.33; after sample quality is strengthened, accuracy rises by 2.04%, precision by 4.29%, and recall by 1.71%, while the missed detection rate falls by 0.04. ② Under positive-negative sample imbalance, the model achieves an accuracy of 96.03%, an F1 score of 0.78, a precision of 64.50%, a recall of 97.15%, and a missed detection rate of 0.09; after difficult samples are added to training, accuracy drops by 8.45%, precision by 6.93%, and recall by 7.25%, while the missed detection rate rises by 0.18. Difficult samples have the greater effect on the overall performance of the model, and improving the quality of these samples can improve detection accuracy. The proposed method therefore provides a reference for addressing sample imbalance in landslide data for deep learning.

Key words: deep learning, landslide detection, Faster R-CNN, imbalanced samples, GF-2 remote sensing imagery
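
As a consistency check on the figures reported in the abstract, the following minimal Python sketch recomputes the F1 score from the reported precision and recall. It assumes F1 is the standard harmonic mean of the two, which the abstract does not state explicitly. The function name `f1_score` and the dictionary labels are illustrative and do not come from the paper. The missed detection rate is not recomputed, because the abstract does not define how it is calculated.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1].

    Assumption: this is the standard F1 definition; the abstract does not
    give the formula the authors used.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs taken from the abstract, converted from percent.
reported = {
    "difficult-sample imbalance": (0.5696, 0.8658),
    "positive-negative imbalance": (0.6450, 0.9715),
}

# Round to two decimals, the precision at which the abstract reports F1.
f1_values = {name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()}

for name, f1 in f1_values.items():
    print(f"{name}: F1 = {f1:.2f}")
```

Under this assumption the recomputed values round to 0.69 and 0.78, matching the F1 scores quoted for the two experiments.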

CLC Number: