Modality Fusion Vision Transformer for Hyperspectral and LiDAR Data Collaborative Classification

Cited by: 17
Authors
Yang, Bin [1 ]
Wang, Xuan [2 ]
Xing, Ying [2 ,3 ]
Cheng, Chen [4 ]
Jiang, Weiwei [5 ,6 ]
Feng, Quanlong [7 ]
Affiliations
[1] China Unicom Res Inst, Graph Neural Network & Artificial Intelligence Team, Beijing 100032, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China
[3] Yunnan Univ, Yunnan Key Lab Software Engn, Kunming 650500, Peoples R China
[4] China Unicom Res Inst, Network Technol Res Ctr, Beijing 100032, Peoples R China
[5] Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing 100876, Peoples R China
[6] Anhui Univ, Key Lab Unive Wireless Commun, Minist Educ, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230039, Peoples R China
[7] China Agr Univ, Geog Informat Engn, Beijing 100083, Peoples R China
Keywords
Feature extraction; Laser radar; Transformers; Hyperspectral imaging; Data mining; Data models; Vectors; Cross-attention (CA); hyperspectral image (HSI); light detection and ranging (LiDAR); modality fusion; vision transformer (ViT); EXTINCTION PROFILES;
DOI
10.1109/JSTARS.2024.3415729
CLC Classification
TM [electrical technology]; TN [electronic technology, communication technology];
Subject Classification
0808; 0809;
Abstract
In recent years, collaborative classification of multimodal data, e.g., hyperspectral image (HSI) and light detection and ranging (LiDAR), has been widely used to improve remote sensing image classification accuracy. However, existing fusion approaches for HSI and LiDAR have limitations. Fusing the heterogeneous features of HSI and LiDAR is challenging, leading to incomplete utilization of the available information for category representation. In addition, during the extraction of spatial features from HSI, the spectral and spatial information are often treated separately, which makes it difficult to fully exploit the rich spectral information in hyperspectral data. To address these issues, we propose a multimodal data fusion framework specifically designed for HSI and LiDAR fusion classification, called the modality fusion vision transformer. At its core is a stackable modality fusion block, which mainly consists of multimodal cross-attention modules and spectral self-attention modules. The proposed multimodal cross-attention module for feature fusion addresses the insufficient fusion of heterogeneous HSI and LiDAR features for category representation; compared with other cross-attention methods, it relaxes the alignment requirements between modal feature spaces during cross-modal fusion. The spectral self-attention module preserves spatial features while exploiting the rich spectral information, and participates in the process of extracting spatial features from HSI. Ultimately, we achieve overall classification accuracies of 99.91%, 99.59%, and 96.98% on three benchmark datasets, respectively, surpassing state-of-the-art methods and demonstrating the stability and effectiveness of our model.
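The cross-attention idea described in the abstract — tokens from one modality querying keys and values from the other, so that each HSI token gathers complementary LiDAR context — can be sketched generically. The snippet below is a minimal single-head NumPy illustration, not the authors' implementation: the function name `cross_attention`, the random projection weights (standing in for learned parameters), and all shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(hsi_tokens, lidar_tokens, rng, d_k=16):
    """Single-head cross-attention sketch: HSI tokens act as queries,
    LiDAR tokens supply keys and values."""
    d_h = hsi_tokens.shape[-1]
    d_l = lidar_tokens.shape[-1]
    # random projections stand in for learned weight matrices
    W_q = rng.standard_normal((d_h, d_k)) / np.sqrt(d_h)
    W_k = rng.standard_normal((d_l, d_k)) / np.sqrt(d_l)
    W_v = rng.standard_normal((d_l, d_k)) / np.sqrt(d_l)
    Q = hsi_tokens @ W_q               # (n_hsi, d_k)
    K = lidar_tokens @ W_k             # (n_lidar, d_k)
    V = lidar_tokens @ W_v             # (n_lidar, d_k)
    scores = Q @ K.T / np.sqrt(d_k)    # (n_hsi, n_lidar)
    attn = softmax(scores, axis=-1)    # each row sums to 1
    return attn @ V, attn              # fused HSI tokens, attention weights

rng = np.random.default_rng(0)
hsi = rng.standard_normal((9, 32))    # e.g., 9 spatial tokens, 32 spectral dims
lidar = rng.standard_normal((9, 8))   # 9 tokens, 8 elevation-feature dims
fused, attn = cross_attention(hsi, lidar, rng)
print(fused.shape, attn.shape)        # (9, 16) (9, 9)
```

Note that because the queries and the keys/values are projected by separate matrices, the two modalities need not share a feature dimension; this is one way such a design can ease the feature-space alignment requirement the abstract mentions.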
Pages: 17052-17065
Page count: 14
Related Papers
50 records in total
  • [41] Joint Classification of Hyperspectral and LiDAR Data via Multiprobability Decision Fusion Method
    Chen, Tao
    Chen, Sizuo
    Chen, Luying
    Chen, Huayue
    Zheng, Bochuan
    Deng, Wu
    REMOTE SENSING, 2024, 16 (22)
  • [42] Information Fusion for Classification of Hyperspectral and LiDAR Data Using IP-CNN
    Zhang, Mengmeng
    Li, Wei
    Tao, Ran
    Li, Hengchao
    Du, Qian
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [43] OBJECT-BASED FUSION OF HYPERSPECTRAL AND LIDAR DATA FOR CLASSIFICATION OF URBAN AREAS
    Marpu, Prashanth Reddy
    Martinez, Sergio Sanchez
    2015 7TH WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2015,
  • [44] URBAN AREA OBJECT-BASED CLASSIFICATION BY FUSION OF HYPERSPECTRAL AND LIDAR DATA
    Kiani, Kamel
    Mojaradi, Barat
    Esmaeily, Ali
    Salehi, Bahram
    2014 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2014,
  • [45] Joint Classification of Hyperspectral and LiDAR Data Based on Adaptive Gating Mechanism and Learnable Transformer
    Wang, Minhui
    Sun, Yaxiu
    Xiang, Jianhong
    Sun, Rui
    Zhong, Yu
    REMOTE SENSING, 2024, 16 (06)
  • [46] Fusion of Multispectral LiDAR, Hyperspectral, and RGB Data for Urban Land Cover Classification
    Haensch, Ronny
    Hellwich, Olaf
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2021, 18 (02) : 366 - 370
  • [47] Fusion of hyperspectral and LIDAR remote sensing data for classification of complex forest areas
    Dalponte, Michele
    Bruzzone, Lorenzo
    Gianelle, Damiano
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2008, 46 (05) : 1416 - 1427
  • [48] Urban classification by multi-feature fusion of hyperspectral image and LiDAR data
    Cao Q.
    Ma A.
    Zhong Y.
    Zhao J.
    Zhao B.
    Zhang L.
    Yaogan Xuebao/Journal of Remote Sensing, 2019, 23 (05) : 892 - 903
  • [49] MULTI-SCALE FEATURE FUSION FOR HYPERSPECTRAL AND LIDAR DATA JOINT CLASSIFICATION
    Zhang, Maqun
    Gao, Feng
    Dong, Junyu
    Qi, Lin
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 2856 - 2859
  • [50] Multi-attentive hierarchical dense fusion net for fusion classification of hyperspectral and LiDAR data
    Wang, Xianghai
    Feng, Yining
    Song, Ruoxi
    Mu, Zhenhua
    Song, Chuanming
    INFORMATION FUSION, 2022, 82 : 1 - 18