Cross-Modal Contrastive Learning for Remote Sensing Image Classification

Cited by: 15
Authors
Feng, Zhixi [1]
Song, Liangliang [1]
Yang, Shuyuan [1]
Zhang, Xinyu [1]
Jiao, Licheng [1]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal contrastive learning (CMCL); multimodal remote sensing image (MRSI) classification; self-supervised; LIDAR DATA; FUSION;
DOI
10.1109/TGRS.2023.3296703
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry];
Subject classification codes
0708 ; 070902 ;
Abstract
Recently, multimodal remote sensing image (MRSI) classification has attracted increasing attention from researchers. However, classifying MRSI with limited labeled instances remains challenging. In this article, a novel self-supervised cross-modal contrastive learning (CMCL) method is proposed for MRSI classification. During pretraining, intramodal contrastive learning (IMCL) and CMCL objectives are jointly optimized to better mine multimodal feature representations, which encourages the learned representation to be semantically consistent both within and between modalities. Moreover, a simple but effective hybrid cross-modal fusion module (HCFM) is designed for the fine-tuning stage; it compactly integrates complementary information across the modalities for more accurate classification. Extensive experiments are conducted on four benchmark datasets (Houston 2013; Augsburg, Germany; Trento, Italy; and Berlin, Germany), and the results show that the proposed method outperforms state-of-the-art methods.
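The joint objective described in the abstract can be illustrated with a generic InfoNCE-style loss. This is only a minimal sketch: the abstract does not give the paper's actual loss, so the function names (`info_nce`, `joint_loss`), the temperature, the `weight` parameter, and the use of two augmented views per modality are all assumptions made for illustration, not the authors' implementation.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i]
    and be dissimilar to every other row of positives."""
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i with every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


def joint_loss(a_view1, a_view2, b_view1, b_view2, weight=1.0):
    """Hypothetical joint objective: intramodal terms compare two views of the
    same modality; cross-modal terms compare modality A with modality B."""
    intra = info_nce(a_view1, a_view2) + info_nce(b_view1, b_view2)
    cross = info_nce(a_view1, b_view1) + info_nce(b_view1, a_view1)
    return intra + weight * cross


# Toy embeddings: two samples, two dimensions.
aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = info_nce(aligned, aligned)
loss_shuffled = info_nce(aligned, shuffled)
```

In this toy case the loss is small when matching pairs share an embedding and large when the pairing is shuffled, which is the behavior both the intramodal and cross-modal terms rely on.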
Pages: 13
Related papers
50 in total
  • [31] Remote Sensing Image Scene Classification Based on Supervised Contrastive Learning
    Guo Dongen
    Xia Ying
    Luo Xiaobo
    Feng Jiangfan
    ACTA PHOTONICA SINICA, 2021, 50 (07)
  • [32] Remote Sensing Cross-Modal Retrieval by Deep Image-Voice Hashing
    Zhang, Yichao
    Zheng, Xiangtao
    Lu, Xiaoqiang
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2022, 15 : 9327 - 9338
  • [33] Exploring Uni-Modal Feature Learning on Entities and Relations for Remote Sensing Cross-Modal Text-Image Retrieval
    Zhang, Shun
    Li, Yupeng
    Mei, Shaohui
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [34] Deep Cross-Modal Image-Voice Retrieval in Remote Sensing
    Chen, Yaxiong
    Lu, Xiaoqiang
    Wang, Shuai
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2020, 58 (10) : 7049 - 7061
  • [35] Hypersphere-Based Remote Sensing Cross-Modal Text-Image Retrieval via Curriculum Learning
    Zhang, Weihang
    Li, Jihao
    Li, Shuoke
    Chen, Jialiang
    Zhang, Wenkai
    Gao, Xin
    Sun, Xian
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [36] Cross-modal contrastive learning for aspect-based recommendation
    Won, Heesoo
    Oh, Byungkook
    Yang, Hyeongjun
    Lee, Kyong-Ho
    INFORMATION FUSION, 2023, 99
  • [37] Enriched Music Representations With Multiple Cross-Modal Contrastive Learning
    Ferraro, Andres
    Favory, Xavier
    Drossos, Konstantinos
    Kim, Yuntae
    Bogdanov, Dmitry
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 733 - 737
  • [38] Cross-modal Contrastive Learning for Multimodal Fake News Detection
    Wang, Longzheng
    Zhang, Chuang
    Xu, Hongbo
    Xu, Yongxiu
    Xu, Xiaohan
    Wang, Siqi
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023 : 5696 - 5704
  • [39] Momentum Cross-Modal Contrastive Learning for Video Moment Retrieval
    Han, De
    Cheng, Xing
    Guo, Nan
    Ye, Xiaochun
    Rainer, Benjamin
    Priller, Peter
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (07) : 5977 - 5994
  • [40] Improving Spoken Language Understanding with Cross-Modal Contrastive Learning
    Dong, Jingjing
    Fu, Jiayi
    Zhou, Peng
    Li, Hao
    Wang, Xiaorui
    INTERSPEECH 2022, 2022 : 2693 - 2697