Convolutional Neural Network and Vision Transformer-driven Cross-layer Multi-scale Fusion Network for Hyperspectral Image Classification

Authors
Zhao F. [1]
Geng M. [1]
Liu H. [2]
Zhang J. [1]
Yu J. [3]
Affiliations
[1] School of Communications and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an
[2] School of Computer Science, Shaanxi Normal University, Xi’an
[3] University of Science and Technology of China, Hefei
Funding
National Natural Science Foundation of China
Keywords
Convolutional Neural Network (CNN); Fusion network; HyperSpectral Image (HSI) classification; Multi-scale features; Vision transformer
DOI
10.11999/JEIT231209
Abstract
HyperSpectral Image (HSI) classification is one of the most prominent research topics in geoscience and remote sensing image processing. In recent years, combining Convolutional Neural Networks (CNNs) with vision transformers has proved successful in HSI classification because the combination considers both local and global information. Nevertheless, the ground objects in HSIs vary in scale and contain rich texture information and complex structures, and current methods that combine CNNs and vision transformers usually have limited capability to extract the texture and structural information of multi-scale ground objects. To overcome these limitations, a CNN and vision transformer-driven cross-layer multi-scale fusion network is proposed for HSI classification. Firstly, from the perspective of combining CNN and vision transformer, a cross-layer multi-scale local-global feature extraction branch is constructed, composed of a convolution embedded vision transformer architecture and a cross-layer feature fusion module. Specifically, to enhance attention to multi-scale ground objects in HSIs, the convolution embedded vision transformer captures multi-scale local-global features by combining a multi-scale CNN with a vision transformer. Furthermore, the cross-layer feature fusion module aggregates hierarchical multi-scale local-global features, thereby combining the shallow texture information and deep structural information of ground objects. Secondly, a group multi-scale convolution branch is designed to explore the potential multi-scale features in the abundant spectral bands of HSIs. Finally, to mine local spectral details and global spectral information in HSIs, a residual group convolution module is designed to extract local-global spectral features. Experimental results on the Indian Pines, Houston 2013, and Salinas Valley datasets confirm the effectiveness of the proposed method. © 2024 Science Press. All rights reserved.
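The abstract names a group multi-scale convolution branch over the spectral bands but gives no group count, kernel sizes, or weights. The following is only a minimal sketch of the general idea, not the authors' module: the bands of one pixel's spectrum are split into equal groups, and each group is filtered with a 1-D kernel of a different size. The function names `conv1d_same` and `group_multiscale_conv`, the default kernel sizes, and the fixed averaging weights (standing in for learned weights) are all assumptions made for illustration.

```python
def conv1d_same(signal, kernel):
    """1-D correlation with zero padding; output length equals input length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def group_multiscale_conv(spectrum, kernel_sizes=(1, 3, 5)):
    """Split spectral bands into groups and filter each group at its own scale.

    Hypothetical illustration: one group per kernel size, and averaging
    kernels used as placeholders for weights a network would learn.
    """
    n_groups = len(kernel_sizes)
    if len(spectrum) % n_groups != 0:
        raise ValueError("band count must be divisible by the number of groups")
    group_len = len(spectrum) // n_groups

    fused = []
    for g, size in enumerate(kernel_sizes):
        group = spectrum[g * group_len:(g + 1) * group_len]
        kernel = [1.0 / size] * size  # placeholder weights, not learned
        fused.extend(conv1d_same(group, kernel))
    return fused  # concatenation of the per-group responses


# Usage: a toy 6-band spectrum, two groups with kernel sizes 1 and 3.
features = group_multiscale_conv([3.0, 3.0, 3.0, 6.0, 6.0, 6.0], kernel_sizes=(1, 3))
```

In this toy setting the output has the same number of values as there are input bands, with the first group passed through unchanged (kernel size 1) and the second group smoothed over a three-band window.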
Pages: 2237-2248
Page count: 11