Bulletin of Surveying and Mapping ›› 2026, Vol. 0 ›› Issue (5): 44-49,79. doi: 10.13474/j.cnki.11-2246.2026.0509

• Academic papers of the 28th Annual Meeting of the China Association for Science and Technology •

Confidence-guided multi-modal Transformer for rice extraction in cloud-prone regions

WANG Junqiang1,2, SUN Zhenhui3, MENG Qingyan1,4, ZHANG Linlin1,4

  1. Innovation Institute of Carbon Peaking and Carbon Neutrality, TCARE & Jiashan, Jiaxing 314117, China;
  2. Technical Centre for Soil, Agriculture and Rural Ecology and Environment, Ministry of Ecology and Environment, Beijing 100012, China;
  3. School of Geology and Geomatics, Tianjin Chengjian University, Tianjin 300384, China;
  4. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2026-04-14 Published: 2026-06-09
  • Corresponding author: SUN Zhenhui. E-mail: sunzh@tcu.edu.cn
  • About the author: WANG Junqiang (born 1993), male, master's degree, engineer; main research interest: pollution reduction and carbon mitigation in agriculture and rural areas. E-mail: wangjunqiang@tcare-mee.cn
  • Funding:
    Dual-Carbon Project of the Technical Centre for Soil, Agriculture and Rural Ecology and Environment, Ministry of Ecology and Environment (2024-11); Ecological Monitoring and Assessment Application Project (Phase I) of the Ground Application System of the Fengyun-3 Batch 03 Meteorological Satellite Program (ZQC-R22227); Tianjin Science and Technology Popularization Project (24KPHDRC00240)

Abstract: [Purpose] Optical remote sensing images of cloud-prone regions are susceptible to cloud contamination, which reduces crop classification accuracy. Synthetic aperture radar (SAR) offers all-weather imaging and can complement optical data. This paper therefore proposes a confidence-guided multi-modal Transformer method for rice extraction. [Methods] First, using HLS optical and Sentinel-1 radar time-series data, the method estimates the confidence of the optical data at each acquisition date from cloud masks. A Transformer encoder then extracts the temporal features of the optical and radar data separately, and a confidence-guided gated fusion mechanism adaptively fuses the two. Finally, a self-supervised temporal reconstruction strategy masks and reconstructs part of the optical data, which strengthens the model's ability to use SAR information to compensate for missing optical observations and improves robustness. [Results] Experiments show that the proposed method achieves an overall accuracy of 88.00% and an F1 score of 88.25%, outperforming comparison models such as random forest, LSTM, and Transformer. [Conclusions] The proposed model effectively improves rice extraction accuracy in cloud-prone regions and provides a reference for crop extraction under complex climatic conditions.

Key words: rice classification, multi-modal fusion, time series, confidence guidance, self-supervised learning
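
The abstract describes confidence-guided gated fusion only at a high level, so the following is a minimal sketch of one way such a gate could behave, not the authors' implementation. It assumes a per-date cloud fraction in [0, 1] as the confidence source and a scalar gate of the form confidence × sigmoid(score). The function names (`optical_confidence`, `gated_fusion`) and the parameters `w_opt`, `w_sar`, and `bias` are hypothetical. In a real model the feature vectors would come from the Transformer encoders and the gate weights would be learned.

```python
import math


def optical_confidence(cloud_fraction):
    """Map per-date cloud fraction in [0, 1] to a confidence in [0, 1] (assumed rule)."""
    return [1.0 - min(max(f, 0.0), 1.0) for f in cloud_fraction]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(opt_feats, sar_feats, confidence, w_opt, w_sar, bias=0.0):
    """Fuse per-date optical and SAR feature vectors (hypothetical gate).

    gate_t  = confidence_t * sigmoid(w_opt . opt_t + w_sar . sar_t + bias)
    fused_t = gate_t * opt_t + (1 - gate_t) * sar_t
    """
    fused = []
    for opt, sar, conf in zip(opt_feats, sar_feats, confidence):
        score = sum(w * x for w, x in zip(w_opt, opt))
        score += sum(w * x for w, x in zip(w_sar, sar))
        gate = conf * sigmoid(score + bias)
        fused.append([gate * o + (1.0 - gate) * s for o, s in zip(opt, sar)])
    return fused


# Three acquisition dates: clear, half cloudy, fully cloudy (toy values).
conf = optical_confidence([0.0, 0.5, 1.0])
opt = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # stand-in optical features
sar = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # stand-in SAR features
fused = gated_fusion(opt, sar, conf, w_opt=[0.0, 0.0], w_sar=[0.0, 0.0])
```

Multiplying the gate by the confidence means that a fully cloudy date (confidence 0) falls back entirely on the SAR features, whatever the gate score is. This matches the compensation behaviour the abstract attributes to the fusion step.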
