CNN-GCN-based dual-stream network for scene classification of remote sensing images

Cited by: 0
Authors
Deng P. [1 ]
Xu K. [1 ]
Huang H. [1 ,2 ]
Affiliations
[1] Key Laboratory of Optoelectronic Technology and Systems of the Education Ministry of China, Chongqing University, Chongqing
[2] State Key Laboratory of Coal Mine Disaster Dynamics and Control, Chongqing University, Chongqing
Funding
National Natural Science Foundation of China
Keywords
Convolutional neural network; Feature fusion; Graph neural network; High-resolution image; Remote sensing scene classification;
DOI
10.11834/jrs.20210587
Abstract
Scene classification is an important research topic that aims to assign a semantic label to a given image. High-Spatial-Resolution (HSR) images contain abundant information about ground objects, such as geometric structure and spatial layout, but their complexity makes them difficult to interpret effectively. Extracting discriminative features is therefore the key step in improving classification accuracy. Various methods for constructing discriminative representations have been proposed, including handcrafted-feature-based and deep-learning-based methods. The former design handcrafted features using professional knowledge and describe a scene through a single feature or multifeature fusion; for complex scenes, however, handcrafted features show limited discriminative and generalization capabilities. Deep-learning-based methods, owing to their powerful feature-extraction capability, have made remarkable progress in the field of scene classification. Compared with handcrafted approaches, Convolutional Neural Networks (CNNs) can automatically extract deep features from massive HSR images. Nevertheless, CNNs focus mainly on global information, which prevents them from exploring the contextual relationships within HSR images. Recently, Graph Convolutional Networks (GCNs) have become an important branch of deep learning and have been adopted to model the spatial relations hidden in HSR images via graph structures. In this paper, a novel architecture termed the CNN-GCN-based Dual-Stream Network (CGDSN) is proposed for scene classification. The CGDSN consists of two modules: a CNN stream and a GCN stream. In the CNN stream, a pretrained DenseNet-121 serves as the backbone to extract the global features of HSR images. In the GCN stream, VGGNet-16, pretrained on ImageNet, generates the feature maps of its last convolutional layer; an average pooling layer then downsamples these maps before an adjacency matrix is constructed.
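The GCN-stream preprocessing described above (feature maps → average pooling → adjacency matrix) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the 2×2 pooling factor, the thresholded cosine-similarity rule for building the adjacency matrix, and the symmetric normalization are assumptions, since the abstract does not specify them.

```python
import numpy as np

def build_graph(feature_map, pool=2, tau=0.5):
    """Turn a CNN feature map (C, H, W) into graph nodes and a
    normalized adjacency matrix (sketch; similarity rule is assumed)."""
    C, H, W = feature_map.shape
    # average-pool spatially to reduce the number of graph nodes
    fm = feature_map.reshape(C, H // pool, pool, W // pool, pool).mean(axis=(2, 4))
    nodes = fm.reshape(C, -1).T                      # (N, C): one node per location
    unit = nodes / (np.linalg.norm(nodes, axis=1, keepdims=True) + 1e-8)
    sim = unit @ unit.T                              # cosine similarity between nodes
    A = (sim > tau).astype(float)                    # keep only strong affinities
    A_hat = A + np.eye(len(A))                       # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return nodes, d_inv_sqrt @ A_hat @ d_inv_sqrt    # symmetric normalization

rng = np.random.default_rng(0)
X, A_norm = build_graph(rng.standard_normal((512, 14, 14)))
print(X.shape, A_norm.shape)  # (49, 512) (49, 49)
```

With a 14×14 map from the last VGGNet-16 convolutional block and 2×2 pooling, each image yields a 49-node graph whose normalized adjacency matrix feeds the graph convolutional layers.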
With each image represented as a graph, a GCN model is developed to capture context relationships. The two graph convolutional layers of the GCN stream are followed by a global average pooling layer and a Fully Connected (FC) layer to form the context features of HSR images. Finally, to fuse the global and context features adequately, a weighted concatenation layer integrates them, and an FC layer predicts the scene category. The AID, RSSCN7, and NWPU-RESISC45 data sets are chosen to verify the effectiveness of the CGDSN method. Experimental results show that the proposed CGDSN outperforms several state-of-the-art methods in terms of Overall Accuracy (OA). On the AID data set, the OAs reach 95.62% and 97.14% under training ratios of 20% and 50%, respectively. On the RSSCN7 data set, the CGDSN achieves 95.46% with 50% training samples. On the NWPU-RESISC45 data set, the accuracies are 91.86% and 94.12% under training ratios of 10% and 20%, respectively. The proposed CGDSN can thus extract discriminative features and achieve competitive accuracy for scene classification. © 2021, Science Press. All rights reserved.
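The two-layer GCN stream and the weighted-concatenation fusion described above can be sketched in NumPy. All dimensions, the scalar fusion weight `alpha`, and the random weight matrices are illustrative assumptions; the actual CGDSN layer sizes and weighting scheme are defined in the paper, not here.

```python
import numpy as np

def gcn_stream(X, A_norm, W1, W2):
    """Two graph convolutional layers (ReLU) followed by global average
    pooling over the nodes, yielding one context feature vector."""
    H1 = np.maximum(A_norm @ X @ W1, 0.0)
    H2 = np.maximum(A_norm @ H1 @ W2, 0.0)
    return H2.mean(axis=0)

def fuse_and_classify(f_cnn, f_gcn, alpha, W_fc, b_fc):
    """Weighted concatenation of global (CNN) and context (GCN) features,
    then an FC layer with softmax over scene categories (alpha is assumed)."""
    f = np.concatenate([alpha * f_cnn, (1.0 - alpha) * f_gcn])
    logits = f @ W_fc + b_fc
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(1)
X = rng.standard_normal((49, 512))          # 49 graph nodes, 512-dim features
A = np.eye(49)                              # placeholder normalized adjacency
f_gcn = gcn_stream(X, A, rng.standard_normal((512, 256)), rng.standard_normal((256, 128)))
f_cnn = rng.standard_normal(128)            # stand-in for the DenseNet-121 global feature
probs = fuse_and_classify(f_cnn, f_gcn, 0.6, rng.standard_normal((256, 30)), np.zeros(30))
print(probs.shape)  # (30,) — one probability per scene category
```

The key design choice the abstract highlights is that fusion happens at the feature level (weighted concatenation before the final FC layer) rather than by averaging the two streams' predictions.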
Pages: 2270-2282 (12 pages)