CMSE: Cross-Modal Semantic Enhancement Network for Classification of Hyperspectral and LiDAR Data

Cited: 3
Authors
Han, Wenqi [1 ]
Miao, Wang [1 ]
Geng, Jie [1 ]
Jiang, Wen [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Elect & Informat, Xian 710129, Peoples R China
Keywords
Semantics; Laser radar; Feature extraction; Hyperspectral imaging; Land surface; Data models; Data mining; Classification; land cover; multimodal; remote sensing (RS); semantic features; IMAGE CLASSIFICATION; NEURAL-NETWORK; FUSION;
DOI
10.1109/TGRS.2024.3368509
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Subject Classification Codes
0708; 070902
Abstract
The fusion of hyperspectral image (HSI) and light detection and ranging (LiDAR) data is widely used for land cover classification. However, owing to their different imaging mechanisms, HSI and LiDAR data exhibit significant differences, and their dimensions and feature distributions are highly dissimilar. This makes it challenging to represent and correlate semantic information across the two modalities. Current pixel-wise classification methods, which rely on cascaded or attention-based fusion, cannot effectively exploit multimodal features. To achieve accurate classification, it is vital to extract and fuse both the shared high-order semantic information and the complementary discriminative information contained in multimodal data. In this article, we propose a cross-modal semantic enhancement network (CMSE) for multimodal semantic information mining and fusion. The proposed CMSE framework extracts image features at multiple scales, capturing more representative local sparse features with convolution kernels of different sizes. To represent high-level semantic features related to land cover, we establish a Gaussian-weighted matrix and semantically transform the spatial and spectral features of the distinct branches. Finally, we build a multilevel residual fusion module to incrementally fuse spectral features from HSI with elevation features from LiDAR. In addition, we introduce a cross-modal semantically constrained loss to guide multimodal semantic feature alignment. We evaluate our approach on three multimodal remote sensing (RS) datasets: Houston2013, Trento, and MUUFL. The experimental results demonstrate that the proposed CMSE model achieves superior accuracy and robustness compared with other related deep networks.
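The Gaussian-weighted matrix mentioned in the abstract can be read as a soft assignment of per-pixel features to semantic centers. The following is a minimal NumPy sketch of that idea, not the paper's exact formulation: the center matrix, the shapes, and the single bandwidth parameter `sigma` are all illustrative assumptions.

```python
import numpy as np

def gaussian_semantic_weights(features, centers, sigma=1.0):
    """Soft-assign pixel features to semantic centers with a Gaussian kernel.

    features: (N, D) array of per-pixel features
    centers:  (K, D) array of hypothetical semantic class centers
    Returns an (N, K) row-stochastic weight matrix.
    """
    # squared Euclidean distance from every feature to every center
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))       # Gaussian weighting
    return w / w.sum(axis=1, keepdims=True)    # normalize each row

def semantic_transform(features, centers, sigma=1.0):
    """Re-express features as Gaussian-weighted combinations of the centers."""
    w = gaussian_semantic_weights(features, centers, sigma)
    return w @ centers                          # (N, K) @ (K, D) -> (N, D)
```

In such a sketch, a small `sigma` makes the assignment nearly hard (each pixel maps to its closest center), while a large `sigma` blends all centers; the paper's actual weighting and how the centers are obtained may differ.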
Pages: 1 - 14 (14 pages)
Related Papers
50 records
  • [1] A cross-modal feature aggregation and enhancement network for hyperspectral and LiDAR joint classification
    Zhang, Yiyan
    Gao, Hongmin
    Zhou, Jun
    Zhang, Chenkai
    Ghamisi, Pedram
    Xu, Shufang
    Li, Chenming
    Zhang, Bing
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258
  • [2] Dynamic Cross-Modal Feature Interaction Network for Hyperspectral and LiDAR Data Classification
    Lin, Junyan
    Gao, Feng
    Qi, Lin
    Dong, Junyu
    Du, Qian
    Gao, Xinbo
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [3] S2ENet: Spatial-Spectral Cross-Modal Enhancement Network for Classification of Hyperspectral and LiDAR Data
    Fang, Sheng
    Li, Kaiyu
    Li, Zhe
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
  • [4] Progressive Semantic Enhancement Network for Hyperspectral and LiDAR Classification
    Fu, Xiyou
    Zhou, Xi
    Fu, Yawen
    Liu, Pan
    Jia, Sen
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [5] HR and LiDAR Data Collaborative Semantic Segmentation Based on Adaptive Cross-Modal Fusion Network
    Ye, Zhen
    Li, Zhen
    Wang, Nan
    Li, Yuan
    Li, Wei
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 : 12153 - 12168
  • [6] Dual-Branch Feature Fusion Network Based Cross-Modal Enhanced CNN and Transformer for Hyperspectral and LiDAR Classification
    Wang, Wuli
    Li, Chong
    Ren, Peng
    Lu, Xinchao
    Wang, Jianbu
    Ren, Guangbo
    Liu, Baodi
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21 : 1 - 5
  • [7] Semantic enhancement and multi-level alignment network for cross-modal retrieval
    Chen, Jia
    Zhang, Hong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (40) : 88221 - 88243
  • [8] Boosting LiDAR-Based Semantic Labeling by Cross-modal Training Data Generation
    Piewak, Florian
    Pinggera, Peter
    Schaefer, Manuel
    Peter, David
    Schwarz, Beate
    Schneider, Nick
    Enzweiler, Markus
    Pfeiffer, David
    Zoellner, Marius
    COMPUTER VISION - ECCV 2018 WORKSHOPS, PT VI, 2019, 11134 : 497 - 513
  • [9] Semantic Guidance Fusion Network for Cross-Modal Semantic Segmentation
    Zhang, Pan
    Chen, Ming
    Gao, Meng
    SENSORS, 2024, 24 (08)
  • [10] MS2CANet: Multiscale Spatial-Spectral Cross-Modal Attention Network for Hyperspectral Image and LiDAR Classification
    Wang, Xianghai
    Zhu, Junheng
    Feng, Yining
    Wang, Lu
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21 : 1 - 5