Bulletin of Surveying and Mapping ›› 2025, Vol. 0 ›› Issue (11): 84-90,153. doi: 10.13474/j.cnki.11-2246.2025.1113

• Academic Research •

Land cover classification in multi-modal remote sensing images using dual attention and multi-branch losses

YU Xiaowei, ZHENG Yadong, LIANG Li

  1. Henan Remote Sensing Institute, Zhengzhou 450000, China
  • Received: 2025-09-08 Published: 2025-12-04
  • About the author: YU Xiaowei (1971-), male, master's degree, senior engineer, mainly engaged in research and applications in photogrammetry and remote sensing and in surveying and mapping engineering. E-mail: 578567493@qq.com
  • Supported by:
    Henan Provincial Excellent Youth Fund (252300421206)

Abstract: Existing land cover classification methods for multimodal remote sensing images still face challenges in feature extraction and fusion quality. This paper proposes a multimodal image classification method, dual-attentive triple-branch fusionNet (DATF-Net), which integrates dual-attention collaboration with a multi-branch joint loss. A cross-correlation feature enhancement strategy is adopted, and a channel-spatial attention collaboration mechanism is introduced to fuse complementary multimodal features thoroughly. A joint constraint of Dice loss and cross-entropy loss keeps the decisions of the different branches consistent. Ablation and comparative experiments were conducted on the Dongying multimodal image dataset. The results show that both the dual-attention collaboration mechanism and the multi-branch joint loss function improve land cover classification accuracy. Compared with the other methods, DATF-Net achieves the best per-class precision and the best scores on multiple overall classification metrics. In particular, on the two key metrics of overall accuracy (OA) and frequency-weighted intersection over union (FWIoU), DATF-Net exceeds the second-best method, VFesuNet, by 7.9% and 12.66%, respectively. The proposed method mitigates speckle noise interference in SAR images, improves the coherence of classification boundaries, and raises classification accuracy and robustness in complex urban scenes.
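For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how OA and FWIoU are conventionally computed from a confusion matrix. It uses the standard definitions rather than anything taken from the paper's code, and the function names are illustrative assumptions.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are reference classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm):
    """OA: correctly classified pixels divided by all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def fw_iou(cm):
    """FWIoU: per-class IoU weighted by each class's share of reference pixels."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    score = 0.0
    for i in range(n):
        ref_i = sum(cm[i])                         # pixels whose reference label is i
        pred_i = sum(cm[r][i] for r in range(n))   # pixels predicted as class i
        union = ref_i + pred_i - cm[i][i]
        if union > 0:                              # skip classes absent from both maps
            score += (ref_i / total) * (cm[i][i] / union)
    return score
```

Because FWIoU weights each class by its pixel frequency, dominant land cover classes influence it more than rare ones, which is why it is often reported alongside OA.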

Key words: multi-modal remote sensing imagery, land cover classification, convolutional neural network, attention mechanism, loss function
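The joint constraint of Dice loss and cross-entropy loss mentioned in the abstract can be illustrated with a small pure-Python sketch. The abstract does not give the weighting or the exact per-branch formulation, so the equal weights, the smoothing term, and the simple sum over branches below are assumptions made for illustration, as are all function names.

```python
import math

EPS = 1e-7  # guards against log(0); an illustrative choice


def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the reference class."""
    return -sum(math.log(max(p[y], EPS)) for p, y in zip(probs, labels)) / len(labels)


def dice_loss(probs, labels, num_classes, smooth=1.0):
    """One minus the soft Dice coefficient, averaged over classes."""
    total = 0.0
    for c in range(num_classes):
        inter = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_mass = sum(p[c] for p in probs)          # predicted probability mass for c
        ref_mass = sum(1 for y in labels if y == c)   # reference pixel count for c
        total += (2.0 * inter + smooth) / (pred_mass + ref_mass + smooth)
    return 1.0 - total / num_classes


def multi_branch_loss(branch_probs, labels, num_classes, dice_weight=1.0, ce_weight=1.0):
    """Sum Dice + cross-entropy over every branch's per-pixel class probabilities.

    Assumption: each branch is supervised by the same labels and the branch
    losses are simply added; the paper may weight or combine them differently.
    """
    return sum(
        dice_weight * dice_loss(p, labels, num_classes)
        + ce_weight * cross_entropy(p, labels)
        for p in branch_probs
    )
```

Here `branch_probs` is a list with one entry per branch, each entry being a list of per-pixel probability vectors (softmax outputs). Supervising every branch against the same reference labels is one simple way to push the branches toward consistent decisions.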

CLC number: