Deep Symmetric Fusion Transformer for Multimodal Remote Sensing Data Classification

Cited: 3
Authors
Chang, Honghao [1 ]
Bi, Haixia [1 ]
Li, Fan [1 ]
Xu, Chen [2 ,3 ]
Chanussot, Jocelyn [4 ]
Hong, Danfeng [5 ,6 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Informat & Commun Engn, Xian 710049, Peoples R China
[2] Peng Cheng Lab, Dept Math & Fundamental Res, Shenzhen 518055, Peoples R China
[3] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Peoples R China
[4] Univ Grenoble Alpes, CNRS, INRIA, Grenoble INP LJK, F-38000 Grenoble, France
[5] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100049, Peoples R China
[6] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China
Keywords
Land-cover classification; local-global mixture (LGM); multimodal feature fusion; remote sensing; symmetric fusion transformer (SFT); LAND-COVER CLASSIFICATION; LIDAR DATA;
DOI
10.1109/TGRS.2024.3476975
CLC Classification Number
P3 [Geophysics]; P59 [Geochemistry];
Discipline Code
0708; 070902;
Abstract
In recent years, multimodal remote sensing data classification (MMRSC) has attracted growing attention because it delineates Earth's surface more comprehensively and accurately than its single-modal counterpart. However, capturing and integrating local and global features from single-modal data remains challenging, and fully excavating and exploiting the interactions between different modalities is still an intricate issue. To this end, we propose a novel dual-branch transformer-based framework named the deep symmetric fusion transformer (DSymFuser). Within the framework, each branch contains a stack of local-global mixture (LGM) blocks that extract hierarchical and discriminative single-modal features. In each LGM block, a local-global feature mixer with learnable weights adaptively aggregates the local and global features extracted by a convolutional neural network (CNN)-transformer network. Furthermore, we design a symmetric fusion transformer (SFT) that follows each LGM block. The SFT symmetrically excavates cross-modal correlations, comprehensively exploiting the complementary cues underlying the heterogeneous modalities. The hierarchical arrangement of the LGM and SFT blocks enables feature extraction and fusion in a multilevel manner, further promoting the completeness and descriptiveness of the learned features. Extensive ablation studies and comparative experiments on three benchmark datasets validate the effectiveness and superiority of the proposed method. The source code will be made publicly available at https://github.com/HaixiaBi1982/DSymFuser.
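The two mechanisms highlighted in the abstract, a learnable-weight mixer that gates local (convolutional) against global (self-attention) features, and a symmetric cross-attention in which each modality attends to the other, can be sketched in minimal single-head NumPy form. This is an illustrative sketch only: the function names, the scalar sigmoid gate `alpha`, and the shared projection matrices are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention: the "global" path.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

def local_conv(x, kernel):
    # 1-D convolution over the token axis, applied per channel: the "local" path.
    n, _ = x.shape
    pad = len(kernel) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for i in range(n):
        out[i] = sum(kernel[j] * xp[i + j] for j in range(len(kernel)))
    return out

def lgm_mix(x, alpha, Wq, Wk, Wv, kernel):
    # Adaptive aggregation: a learnable scalar (here `alpha`, passed through a
    # sigmoid) weights the local path against the global path.
    g = 1.0 / (1.0 + np.exp(-alpha))
    return g * local_conv(x, kernel) + (1.0 - g) * self_attention(x, Wq, Wk, Wv)

def symmetric_cross_attention(xa, xb, Wq, Wk, Wv):
    # Symmetric cross-modal fusion: modality A queries modality B and vice
    # versa, with shared projections, and each output is added residually.
    d = Wq.shape[1]
    a2b = softmax((xa @ Wq) @ (xb @ Wk).T / np.sqrt(d)) @ (xb @ Wv)
    b2a = softmax((xb @ Wq) @ (xa @ Wk).T / np.sqrt(d)) @ (xa @ Wv)
    return xa + a2b, xb + b2a
```

Stacking `lgm_mix` followed by `symmetric_cross_attention` per level would mirror the multilevel extract-then-fuse arrangement the abstract describes.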
Pages: 15
Related Papers (50 records)
  • [1] Multimodal Fusion Transformer for Remote Sensing Image Classification
    Roy, Swalpa Kumar
    Deria, Ankur
    Hong, Danfeng
    Rasti, Behnood
    Plaza, Antonio
    Chanussot, Jocelyn
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [2] A multimodal hyper-fusion transformer for remote sensing image classification
    Ma, Mengru
    Ma, Wenping
    Jiao, Licheng
    Liu, Xu
    Li, Lingling
    Feng, Zhixi
    Liu, Fang
    Yang, Shuyuan
    INFORMATION FUSION, 2023, 96 : 66 - 79
  • [3] Fractional Fourier Image Transformer for Multimodal Remote Sensing Data Classification
    Zhao, Xudong
    Zhang, Mengmeng
    Tao, Ran
    Li, Wei
    Liao, Wenzhi
    Tian, Lianfang
    Philips, Wilfried
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (02) : 2314 - 2326
  • [4] Deep Fusion of Remote Sensing Data for Accurate Classification
    Chen, Yushi
    Li, Chunyang
    Ghamisi, Pedram
    Jia, Xiuping
    Gu, Yanfeng
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2017, 14 (08) : 1253 - 1257
  • [5] Scale Adaptive Fusion Network for Multimodal Remote Sensing Data Classification
    Liu, Xiaomin
    Yu, Mengjun
    Qiao, Zhenzhuang
    Wang, Haoyu
    Xing, Changda
    Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2024, 46 (09): : 3693 - 3702
  • [6] Deep learning in multimodal remote sensing data fusion: A comprehensive review
    Li, Jiaxin
    Hong, Danfeng
    Gao, Lianru
    Yao, Jing
    Zheng, Ke
    Zhang, Bing
    Chanussot, Jocelyn
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2022, 112
  • [7] A Multilevel Multimodal Fusion Transformer for Remote Sensing Semantic Segmentation
    Ma, Xianping
    Zhang, Xiaokang
    Pun, Man-On
    Liu, Ming
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 15
  • [9] Deep learning decision fusion for the classification of urban remote sensing data
    Abdi, Ghasem
    Samadzadegan, Farhad
    Reinartz, Peter
    JOURNAL OF APPLIED REMOTE SENSING, 2018, 12 (01):
  • [10] HGR Correlation Pooling Fusion Framework for Recognition and Classification in Multimodal Remote Sensing Data
    Zhang, Hongkang
    Huang, Shao-Lun
    Kuruoglu, Ercan Engin
    REMOTE SENSING, 2024, 16 (10)