Bulletin of Surveying and Mapping ›› 2024, Vol. 0 ›› Issue (9): 74-79. doi: 10.13474/j.cnki.11-2246.2024.0914

• Academic Research •

Landslide susceptibility assessment considering multi-method integrated feature selection and negative sample optimization

LIU Yiming1,2, XU Shenghua1,2, LIU Chunyang1, MA Yu2,3

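  1. School of Spatial Information and Surveying and Mapping Engineering, Anhui University of Science and Technology, Huainan 232000, China;
    2. Chinese Academy of Surveying and Mapping, Beijing 100080, China;
    3. School of Geomatics, Liaoning Technical University, Fuxin 123000, China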
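  • Received: 2024-05-15 Published: 2024-10-09
  • Corresponding author: XU Shenghua. E-mail: xushh@casm.ac.cn
  • About the author: LIU Yiming (b. 1998), female, master's student, whose main research interest is landslide hazard susceptibility assessment. E-mail: 17731160570@163.com
  • Funding:
    National Key Research and Development Program of China (2020YFC1511704); Basic Research Operating Fund project (AR2310)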



Abstract: Landslide susceptibility assessment often suffers from low prediction accuracy because feature factors are selected subjectively and negative (non-landslide) samples are selected largely at random. To address this, this paper proposes an assessment method that selects feature factors by integrating multiple feature selection methods and optimizes negative sample extraction with the information value method. Taking Bazhong City, Sichuan Province as the study area, the feature selection results of five methods, namely maximum relevance minimum redundancy (mRMR), gradient boosting trees (GBT), extreme gradient boosting (XGBoost), ordinary least squares (OLS), and information gain (IG), are first normalized and summed to obtain a composite score. Second, negative samples are selected with the information value method to build the sample dataset. Third, a support vector machine (SVM) model is used for landslide susceptibility analysis, with a logistic regression (LR) model as the comparison. Finally, the accuracy of the predictions is verified in three ways: the landslide susceptibility zoning map, point density statistics, and the ROC curve. The experimental results show that combining multi-method integrated feature selection with negative sample optimization based on the information value method effectively improves the model's prediction accuracy and yields more accurate and reliable susceptibility results.
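
As a rough illustration of the first two steps named in the abstract, the sketch below normalizes per-method importance scores, sums them into a composite score, and computes information values for the classes of one factor. The abstract does not state the normalization scheme or the exact formula, so min-max scaling and the commonly used form I = ln((N_i/N)/(S_i/S)) for the information value method (信息量法) are assumptions here, as are the function names and the toy numbers. In practice, negative samples would presumably be drawn from areas whose summed information value is low.

```python
import numpy as np

def composite_scores(score_table):
    """Min-max normalize each method's scores to [0, 1] and sum across methods.

    score_table: dict mapping a method name to a 1-D sequence of raw
    importance scores, one per candidate factor (same factor order everywhere).
    """
    total = None
    for scores in score_table.values():
        s = np.asarray(scores, dtype=float)
        span = s.max() - s.min()
        # A method that rates all factors equally carries no ranking signal.
        norm = (s - s.min()) / span if span > 0 else np.zeros_like(s)
        total = norm if total is None else total + norm
    return total

def information_value(landslides_in_class, cells_in_class):
    """Information value per factor class: ln((N_i / N) / (S_i / S)).

    N_i: landslide count in class i, N: total landslides,
    S_i: cell count (area) of class i, S: total cells.
    """
    n = np.asarray(landslides_in_class, dtype=float)
    s = np.asarray(cells_in_class, dtype=float)
    density_ratio = (n / n.sum()) / (s / s.sum())
    # Classes without landslides give -inf; callers may want to clip this.
    with np.errstate(divide="ignore"):
        return np.log(density_ratio)

# Hypothetical raw scores for three factors from three of the five methods.
raw = {
    "mRMR": [0.2, 0.8, 0.5],
    "XGBoost": [10.0, 30.0, 20.0],
    "IG": [0.05, 0.01, 0.03],
}
total = composite_scores(raw)

# Hypothetical factor with two equally sized classes holding 30 and 10 landslides.
iv = information_value([30, 10], [100, 100])
```

Summing normalized scores weights every method equally; a weighted sum would be a natural variation if some selectors are trusted more than others.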

Key words: landslide disaster, feature selection, negative sample optimization, machine learning

CLC number: