Bulletin of Surveying and Mapping ›› 2023, Vol. 0 ›› Issue (6): 68-74, 103. doi: 10.13474/j.cnki.11-2246.2023.0170

• Academic Research •

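Near-ultraviolet channel surface reflectance simulation based on XGBoost algorithm

AO Yong1, LI Hongli2,3, ZHANG Wenjuan3, QIN Meng2,3

  1. School of Land Engineering, Chang'an University, Xi'an 710054, Shaanxi, China;
  2. School of Earth Science and Resources, Chang'an University, Xi'an 710054, Shaanxi, China;
  3. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
  • Received: 2022-11-10 Revised: 2023-04-20 Published: 2023-07-05
  • Corresponding author: ZHANG Wenjuan. E-mail: zhangwj@aircas.ac.cn
  • About the author: AO Yong (born 1965), male, PhD, associate professor; research interests: remote sensing science and technology, and geographic information systems. E-mail: aoyong@chd.edu.cn
  • Funding:
    Youth Innovation Promotion Association of the Chinese Academy of Sciences (2019132)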

Abstract: The ultraviolet spectral region has important applications in global auroral detection, marine oil-spill monitoring, atmospheric airglow observation, and related fields, and surface reflectance in this region is important background data for such research. However, existing satellite data are too scarce to meet application needs. To address this problem, this paper proposes a method based on the XGBoost machine learning algorithm for simulating surface reflectance in near-ultraviolet (N-UV, 350 to 400 nm) channels. First, Sentinel-2 MSI channels 2, 3, and 4 are selected as the data source; in view of their channel characteristics, spectra of typical ground objects such as vegetation, water, and soil are extracted from the USGS spectral library and converted to equivalent reflectance in the corresponding channels. Second, a correlation analysis between the data source and the channels to be simulated is carried out: the correlation coefficients between Sentinel-2 MSI channels 2, 3, and 4 and the channels to be simulated are all greater than 0.88, indicating that N-UV surface reflectance can be simulated from this data source. Third, XGBoost regression models of N-UV channel surface reflectance are built from the channel-equivalent spectral data set. The accuracy analysis shows that, for all channel models, the coefficient of determination (R²) is above 0.91, the root mean square error (RMSE) is less than 0.076, and the mean absolute percentage error (MAPE) is generally within 20%; the standard deviation of these three indicators across sample categories is within 0.0212, so the models are both accurate and robust. Finally, simulated surface reflectance images at 355, 365, 375, 385, and 395 nm are generated from Sentinel-2 MSI channel 2, 3, and 4 image data, and the images reflect the spectral characteristics of ground objects well.
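
The first two steps in the abstract (converting library spectra to channel-equivalent reflectance, then checking correlation between source and target channels) might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the equivalence formula, so a response-weighted mean is assumed; the Gaussian response shape, the 10 nm width of the N-UV channels, and all function and variable names are hypothetical; the MSI channel centers and widths are approximate nominal values rather than the official response tables.

```python
import math

def gaussian_response(wavelengths_nm, center_nm, fwhm_nm):
    """Hypothetical Gaussian spectral response sampled on the wavelength grid."""
    sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((w - center_nm) / sigma) ** 2) for w in wavelengths_nm]

def band_equivalent(reflectance, response):
    """Response-weighted mean reflectance; both lists share one uniform grid."""
    total = sum(response)
    if total == 0:
        raise ValueError("spectral response is zero on this grid")
    return sum(r * s for r, s in zip(reflectance, response)) / total

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# 1 nm grid covering the N-UV range and the three visible MSI channels.
GRID_NM = list(range(340, 721))

# Approximate nominal (center, width) in nm for Sentinel-2 MSI channels 2 to 4
# (assumption: not taken from the paper; check the official response tables).
MSI_BANDS = {"B2": (492, 66), "B3": (560, 36), "B4": (665, 31)}

# Target centers come from the abstract; the 10 nm width is an assumption.
NUV_BANDS = {f"{c} nm": (c, 10) for c in (355, 365, 375, 385, 395)}

def to_channels(spectrum, bands):
    """Convert one spectrum sampled on GRID_NM to channel-equivalent values."""
    return {
        name: band_equivalent(spectrum, gaussian_response(GRID_NM, center, width))
        for name, (center, width) in bands.items()
    }
```

Applying `to_channels` to every library spectrum with `MSI_BANDS` and `NUV_BANDS` would give paired source and target columns, and `pearson` on any source column against any target column gives the kind of correlation coefficient the abstract reports.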

Key words: near-ultraviolet, surface reflectance simulation, machine learning, XGBoost

CLC number:
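
The regression and accuracy-assessment steps summarized in the abstract could look roughly like the sketch below. It assumes one XGBoost regressor per simulated N-UV channel, with the three Sentinel-2 MSI channel reflectances as features. The hyperparameters are placeholders rather than the authors' settings, the helper names are hypothetical, and `numpy` and `xgboost` are third-party packages imported only when the model is actually fitted.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in reflectance units."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; y_true must be nonzero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_nuv_model(msi_rows, nuv_values):
    """Fit one regressor mapping [B2, B3, B4] reflectance to one N-UV channel.

    Hyperparameters are illustrative placeholders, not values from the paper.
    """
    import numpy as np                  # third-party, imported lazily
    from xgboost import XGBRegressor    # third-party, imported lazily

    model = XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.1)
    model.fit(np.asarray(msi_rows, dtype=float), np.asarray(nuv_values, dtype=float))
    return model
```

In such a setup, `fit_nuv_model` would be called once for each of the five target wavelengths, and the three metric functions would be evaluated on held-out samples, overall and per land-cover class, to obtain figures comparable to those quoted in the abstract.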