Deep Symmetric Fusion Transformer for Multimodal Remote Sensing Data Classification

Cited by: 3
Authors
Chang, Honghao [1 ]
Bi, Haixia [1 ]
Li, Fan [1 ]
Xu, Chen [2 ,3 ]
Chanussot, Jocelyn [4 ]
Hong, Danfeng [5 ,6 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Informat & Commun Engn, Xian 710049, Peoples R China
[2] Peng Cheng Lab, Dept Math & Fundamental Res, Shenzhen 518055, Peoples R China
[3] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Peoples R China
[4] Univ Grenoble Alpes, CNRS, INRIA, Grenoble INP LJK, F-38000 Grenoble, France
[5] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100049, Peoples R China
[6] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China
Keywords
Land-cover classification; local-global mixture (LGM); multimodal feature fusion; remote sensing; symmetric fusion transformer (SFT); LiDAR data
DOI
10.1109/TGRS.2024.3476975
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Discipline Codes
0708; 070902
Abstract
In recent years, multimodal remote sensing data classification (MMRSC) has attracted growing attention because it delineates Earth's surface more comprehensively and accurately than its single-modal counterpart. However, it remains challenging to capture and integrate local and global features from single-modal data. Moreover, how to fully mine and exploit the interactions between different modalities is still an intricate issue. To this end, we propose a novel dual-branch transformer-based framework named deep symmetric fusion transformer (DSymFuser). Within the framework, each branch contains a stack of local-global mixture (LGM) blocks to extract hierarchical and discriminative single-modal features. In each LGM block, a local-global feature mixer with learnable weights is devised to adaptively aggregate the local and global features extracted by a convolutional neural network (CNN)-transformer network. Furthermore, we design a symmetric fusion transformer (SFT) that trails each LGM block. The SFT symmetrically mines cross-modal correlations, comprehensively exploiting the complementary cues underlying heterogeneous modalities. The hierarchical construction of the LGM and SFT blocks enables feature extraction and fusion in a multilevel manner, further promoting the completeness and descriptiveness of the learned features. We conducted extensive ablation studies and comparative experiments on three benchmark datasets, and the results validate the effectiveness and superiority of the proposed method. The source code will be made publicly available at https://github.com/HaixiaBi1982/DSymFuser.
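The abstract describes two core ideas: a learnable-weight mixture of CNN-style local and transformer-style global features (LGM), and a symmetric cross-modal attention that lets each modality query the other (SFT). The paper's implementation is not reproduced in this record; the following is a minimal, hypothetical PyTorch sketch of those two ideas only. All module names, dimensions, and design details (depthwise convolution for the local branch, a sigmoid-gated scalar mixing weight, residual cross-attention) are assumptions for illustration, not the authors' code.

```python
# Hypothetical sketch (not the authors' code): a learnable-weight local/global
# feature mixer and a symmetric cross-attention fusion, in the spirit of the
# LGM and SFT blocks described in the abstract.
import torch
import torch.nn as nn


class LocalGlobalMixture(nn.Module):
    """Mix CNN-style local features with attention-based global features
    using a learnable scalar weight (one illustrative choice among many)."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # Local branch: depthwise-separable 1-D convolution over the token axis.
        self.local = nn.Sequential(
            nn.Conv1d(dim, dim, kernel_size=3, padding=1, groups=dim),
            nn.Conv1d(dim, dim, kernel_size=1),
            nn.GELU(),
        )
        # Global branch: multi-head self-attention over all tokens.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        # Learnable mixing weight, squashed to (0, 1) with a sigmoid.
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        local = self.local(x.transpose(1, 2)).transpose(1, 2)
        global_, _ = self.attn(x, x, x, need_weights=False)
        w = torch.sigmoid(self.alpha)
        return self.norm(x + w * local + (1.0 - w) * global_)


class SymmetricCrossAttention(nn.Module):
    """Symmetric cross-modal attention: each modality queries the other,
    so complementary cues flow in both directions."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.a_to_b = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.b_to_a = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_a = nn.LayerNorm(dim)
        self.norm_b = nn.LayerNorm(dim)

    def forward(self, xa: torch.Tensor, xb: torch.Tensor):
        # Modality A attends to modality B and vice versa, with residuals.
        ab, _ = self.a_to_b(xa, xb, xb, need_weights=False)
        ba, _ = self.b_to_a(xb, xa, xa, need_weights=False)
        return self.norm_a(xa + ab), self.norm_b(xb + ba)


if __name__ == "__main__":
    # Toy shapes: 8 samples, 49 patch tokens (e.g. a 7x7 patch), 64 channels.
    hsi = torch.randn(8, 49, 64)    # e.g. hyperspectral patch tokens
    lidar = torch.randn(8, 49, 64)  # e.g. LiDAR patch tokens
    lgm = LocalGlobalMixture(64)
    fuse = SymmetricCrossAttention(64)
    fa, fb = fuse(lgm(hsi), lgm(lidar))
    print(fa.shape, fb.shape)  # torch.Size([8, 49, 64]) twice
```

In the actual framework these two operations are stacked hierarchically (an SFT after every LGM block), so fusion happens at multiple feature levels rather than only once at the end.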
Pages: 15
Related Articles (50 total)
  • [41] DKDFN: Domain Knowledge-Guided deep collaborative fusion network for multimodal unitemporal remote sensing land cover classification
    Li, Yansheng
    Zhou, Yuhan
    Zhang, Yongjun
    Zhong, Liheng
    Wang, Jian
    Chen, Jingdong
    ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2022, 186 : 170 - 189
  • [42] Causal Meta-Reinforcement Learning for Multimodal Remote Sensing Data Classification
    Zhang, Wei
    Wang, Xuesong
    Wang, Haoyu
    Cheng, Yuhu
    REMOTE SENSING, 2024, 16 (06)
  • [43] Fusion of spatial autocorrelation and spectral data for remote sensing image classification
    Haouas, Fatma
    Ben Dhiaf, Zouhour
    Solaiman, Basel
    2016 2ND INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP), 2016, : 537 - 542
  • [44] Fusion of multisensor remote sensing data for urban land cover classification
    Greiwe, A.
    Bochow, M.
    Ehlers, M.
    REMOTE SENSING FOR ENVIRONMENTAL MONITORING, GIS APPLICATIONS, AND GEOLOGY III, 2004, 5239 : 306 - 313
  • [45] Transformer based ensemble deep learning approach for remote sensing natural scene classification
    Sivasubramanian, Arrun
    Prashanth, V. R.
    Sowmya, V.
    Ravi, Vinayakumar
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2024, 45 (10) : 3289 - 3309
  • [46] FTransDeepLab: Multimodal Fusion Transformer-Based DeepLabv3+ for Remote Sensing Semantic Segmentation
    Feng, Haixia
    Hu, Qingwu
    Zhao, Pengcheng
    Wang, Shunli
    Ai, Mingyao
    Zheng, Daoyuan
    Liu, Tiancheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [47] Feature Fusion with Deep Supervision for Remote-Sensing Image Scene Classification
    Muhammad, Usman
    Wang, Weiqiang
    Hadid, Abdenour
    2018 IEEE 30TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2018, : 249 - 253
  • [48] Adaptive Multiscale Deep Fusion Residual Network for Remote Sensing Image Classification
    Li, Ge
    Li, Lingling
    Zhu, Hao
    Liu, Xu
    Jiao, Licheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2019, 57 (11): 8506 - 8521
  • [49] Fusion of Deep Learning Models for Improving Classification Accuracy of Remote Sensing Images
    Deepan, P.
    Sudha, L. R.
    JOURNAL OF MECHANICS OF CONTINUA AND MATHEMATICAL SCIENCES, 2019, 14 (05): 189 - 201
  • [50] Multitask Multisource Deep Correlation Filter for Remote Sensing Data Fusion
    Cheng, Xu
    Zheng, Yuhui
    Zhang, Jianwei
    Yang, Zhangjing
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2020, 13 (13) : 3723 - 3734